
First Week Playbook for Data Scientists: Access, Baselines, Stakeholders

8 min read · April 25, 2026

A concrete first-week plan for data scientists in 2026: get data access on day one, establish baselines, and map the stakeholders who will act on your work.

The fastest way to fail as a new data scientist is to spend three weeks building a beautiful model that nobody asked for, does not connect to a decision, and is trained on data you do not fully understand. The fastest way to succeed is to do three unglamorous things in week one: get data access sorted before noon on day one, re-derive a metric the team already looks at every Monday morning, and meet the humans who will actually act on your analyses. Everything else is downstream.

This playbook works for data scientists, analytics engineers, ML engineers on the analytics-adjacent side, and the increasingly common "product analyst" hybrid role. It applies whether you are joining a data team, embedded in a product team, or the only data person at a startup. The shape is the same.

Day 1 Is For Data Access, Not Modeling

Data access is where onboarding goes to die. You will have ten systems to request permissions for, and if you are not aggressive about it on day one, you will still be waiting on row-level access to a critical table in week three.

Before noon on day one, you need:

  1. Warehouse access — Snowflake, BigQuery, Redshift, Databricks, whatever your company uses, with at least read access to the analytics-ready schemas
  2. BI tool access — Looker, Tableau, Hex, Mode, Metabase, or an internal equivalent
  3. Git access to the dbt, Dataform, or SQLMesh repo where transformations live
  4. Notebook or IDE environment that connects to the warehouse (JupyterHub, Hex, VS Code with credentials, or whatever is standard)
  5. Access to the event stream or raw layer — Segment, Snowplow, Rudderstack, or the raw tables in the warehouse
  6. Slack, the project tracker, and whichever ticketing channel the data team uses

If any of these are blocked by end of day one, escalate to your manager. Onboarding stalls on permissions more often for data scientists than for any other function. Your manager expects you to push.

Then run one query. Literally one: `select count(*) from <core_event_table> where date >= current_date - 7`. Confirm the number matches what your manager or the team's dashboard says it should be. If it does not, you have already found your first week's most valuable task: reconciliation.
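That sanity check can be sketched in a few lines. This is an illustration only: it uses an in-memory SQLite database with a toy events table as a stand-in for your real warehouse client (Snowflake, BigQuery, etc.), and the expected count and tolerance are placeholder values.

```python
# Sanity-check sketch: run one count query and compare it to the number
# the team expects. SQLite stands in for the real warehouse here.
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE core_events (event_date TEXT)")

# Seed seven days of fake events, 100 per day.
today = date.today()
rows = [((today - timedelta(days=d)).isoformat(),)
        for d in range(7) for _ in range(100)]
conn.executemany("INSERT INTO core_events VALUES (?)", rows)

cutoff = (today - timedelta(days=7)).isoformat()
(actual,) = conn.execute(
    "SELECT count(*) FROM core_events WHERE event_date >= ?", (cutoff,)
).fetchone()

expected = 700    # the number your manager or the dashboard quotes
tolerance = 0.02  # allow ~2% drift for late-arriving events

if abs(actual - expected) / expected > tolerance:
    print(f"MISMATCH: got {actual}, dashboard says {expected}")
else:
    print(f"OK: {actual} events in the last 7 days matches the dashboard")
```

If the mismatch branch fires against your real warehouse, that gap is your reconciliation project.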

Re-Derive The Team's Top Metric On Day 2

Every data team has one or two numbers that leadership looks at every Monday morning. Weekly active users, gross revenue, conversion from signup to paid, retention cohorts — something. Your onboarding goal on day two is to write the SQL that produces that number from scratch, compare your output to the official dashboard, and explain any discrepancy.

You will find one of three things:

  • Your number matches. Great. You now understand the metric at a deeper level than 80% of your colleagues.
  • Your number is close but off by a few percent. There is a definitional subtlety — maybe filtered users, bot traffic, internal accounts, timezone cutoffs. Chase it down.
  • Your number is wildly off. Either your query is wrong or the official metric is. Both outcomes are valuable.

Write up what you found. Share it in a short doc with your manager and the analytics engineer who owns the underlying model. Do not frame it as a correction. Frame it as a learning: "Here is how I rebuilt the metric, here is where my version diverged from the dashboard, here is what I think explains the difference."

If you cannot rebuild the single most important metric the team looks at from raw tables by end of week one, you are not yet ready to produce analysis on top of it. That is not a judgment — it is just how the craft works.
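The rebuild-and-reconcile loop above can be sketched concretely. Everything here is hypothetical: the table, column names, and the rule that the official metric excludes internal accounts are stand-ins for whatever definitional subtlety your company actually has, and SQLite stands in for the warehouse.

```python
# Rebuild-the-metric sketch: derive weekly active users (WAU) from a raw
# events table two ways and compare, to surface the definitional gap.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE raw_events (user_id INTEGER, event_date TEXT, is_internal INTEGER)"
)
conn.executemany(
    "INSERT INTO raw_events VALUES (?, ?, ?)",
    [
        (1, "2026-04-20", 0), (1, "2026-04-21", 0),  # real user, two events
        (2, "2026-04-22", 0),                        # real user
        (3, "2026-04-23", 1),                        # internal/test account
    ],
)

# Naive rebuild: distinct users in the week.
(naive_wau,) = conn.execute(
    "SELECT count(DISTINCT user_id) FROM raw_events "
    "WHERE event_date BETWEEN '2026-04-20' AND '2026-04-26'"
).fetchone()

# Official definition (hypothetically) excludes internal accounts.
(official_wau,) = conn.execute(
    "SELECT count(DISTINCT user_id) FROM raw_events "
    "WHERE event_date BETWEEN '2026-04-20' AND '2026-04-26' "
    "AND is_internal = 0"
).fetchone()

print(f"naive={naive_wau}, official-style={official_wau}")
```

The gap between the two numbers (here, internal accounts) is exactly the kind of discrepancy your writeup should name and explain.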

Map The Data Surface Area

Every company has a data environment that is part-documented, part-tribal-knowledge. Your goal in week one is to get the 80/20 mental model.

Specifically, identify:

  • The 5-10 tables that show up in the most dashboards and queries. In dbt, this is often the `marts/` layer or whatever is tagged `exposed`. Find them.
  • The 3-5 upstream sources that feed those tables — the production database replicas, the Stripe or billing data source, the product event stream, the CRM sync.
  • The event taxonomy. If the company uses Amplitude or Mixpanel or a warehouse-native approach, find the canonical event list. Read it. It will tell you more about how the product is instrumented than any onboarding doc.
  • The freshness and reliability story. How often does each core table update? When it breaks, who notices? Check the dbt or Airflow or Dagster run history for the last 30 days.

Write this as a one-page data map. You will use it every week for the rest of your tenure.
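One way to keep the one-pager honest is to write it in a structured form you can update in thirty seconds. A minimal sketch, where every table, source, and channel name is a hypothetical placeholder:

```yaml
# Hypothetical one-page data map -- all names are placeholders.
core_tables:
  - name: marts.fct_orders
    used_by: [revenue dashboard, weekly exec report]
    freshness: hourly via dbt
  - name: marts.dim_users
    used_by: [most dashboards]
    freshness: daily at 06:00 UTC
upstream_sources:
  - name: postgres_replica   # production database replica
  - name: stripe             # billing
  - name: segment_events     # product event stream
event_taxonomy: link-to-canonical-event-list
reliability_notes: note recent failed runs and who gets alerted, e.g. in a #data-alerts channel
```

The format matters less than the habit: one file, updated whenever your mental model changes.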

Do Eight Conversations In Five Days

Book 30-minute 1:1s with:

  1. Your manager (probably already booked; make it recurring)
  2. The analytics engineer or data engineer you will depend on most
  3. One peer data scientist, ideally the longest-tenured on the team
  4. The PM of the product area you will support
  5. One engineer on the team whose events you will be analyzing
  6. The BI or analytics lead if separate from your manager
  7. A customer-facing person — head of Support, CS lead, or a senior salesperson — who can tell you what customers complain about
  8. Your skip-level (director or VP of Data, or the CTO if data reports into engineering)

Use these three questions in every meeting:

  • What is the best piece of analysis anyone on the data team shipped in the last year, and what made it work?
  • What is a question you keep asking the data team that never gets answered well?
  • If you were me, what would you dig into first?

The first question tells you what good looks like here. The second tells you exactly where the undelivered value is. The third tells you what the team has already agreed needs doing — and that is your highest-signal quick win.

Establish Baselines Before You Model

The first instinct of a strong data scientist joining a new role is to find the fanciest problem and throw an XGBoost model at it. Resist. In week one, your job is baselines.

For whatever product area you are supporting, document:

  • The current conversion rate at every step of the primary funnel, with confidence intervals
  • The current retention curve, cohorted by signup month, for the last 12 months
  • The current segment sizes — how many users in each plan tier, geography, company size, or whatever the primary segmentation is
  • The base rates for any event or action your team cares about — "what percent of active users do X per week"

Write these numbers down in a shared doc. They become the baseline against which every future experiment, model, or proposed change is judged. A data team that does not have these documented is a data team that relitigates the same questions every quarter.

This is also where you earn early credibility. Many companies have three different numbers for "weekly active users" floating around in different Slack threads. A new hire who publishes the canonical baseline doc in week two is a new hire who gets pulled into strategic conversations in month two.
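The first baseline in that list, a funnel conversion rate with a confidence interval, can be computed with nothing but the standard library. A minimal sketch using the Wilson score interval; the funnel counts are illustrative placeholders:

```python
# Baseline sketch: funnel-step conversion rate with a 95% confidence
# interval via the Wilson score interval (stdlib only).
import math

def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for a binomial proportion."""
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    margin = (z / denom) * math.sqrt(
        p * (1 - p) / trials + z**2 / (4 * trials**2)
    )
    return center - margin, center + margin

signups, paid = 4_000, 520   # hypothetical funnel counts
rate = paid / signups
lo, hi = wilson_interval(paid, signups)
print(f"signup->paid: {rate:.1%} (95% CI {lo:.1%}-{hi:.1%})")
```

The Wilson interval is a reasonable default here because it behaves sensibly even for small counts or rates near 0% and 100%, where the naive normal approximation breaks down.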

Ship One Analysis By Friday

Your first analysis does not need to be a capital-I insight. It needs to be shipped. Good candidates:

  • A dashboard that the team has been asking for in Slack but nobody has built
  • A reconciliation of two conflicting metric definitions, with a recommendation
  • A quick exploratory analysis of a user segment your PM has been curious about
  • A data quality audit of one core table, with counts of nulls, duplicates, and anomalies

Write it up in a short doc — half a page of context, the chart or number, and two or three bullet points of what it means and what to do. Share it in the team Slack channel. Do not bury it in Notion where no one will read it.

The point is to exercise the full pipeline: query, analyze, visualize, write up, share, field questions, revise. You will discover every broken thing in your environment on the first loop.
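Of the Friday candidates above, the data quality audit is the most mechanical to start. A sketch of the core checks, again using an in-memory SQLite table with toy rows as a stand-in for your warehouse, with hypothetical table and column names:

```python
# Data-quality audit sketch for one core table: null counts, duplicate
# keys, and a simple freshness check.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, user_id INTEGER, amount REAL, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, 10, 25.0, "2026-04-20"),
        (2, 11, None, "2026-04-21"),   # null amount
        (2, 11, None, "2026-04-21"),   # duplicated order_id
        (3, None, 9.5, "2026-04-22"),  # null user_id
    ],
)

(null_amounts,) = conn.execute(
    "SELECT count(*) FROM orders WHERE amount IS NULL").fetchone()
(null_users,) = conn.execute(
    "SELECT count(*) FROM orders WHERE user_id IS NULL").fetchone()
(dup_ids,) = conn.execute(
    "SELECT count(*) FROM "
    "(SELECT order_id FROM orders GROUP BY order_id HAVING count(*) > 1)"
).fetchone()
(latest,) = conn.execute("SELECT max(created_at) FROM orders").fetchone()

print(f"null amounts: {null_amounts}, null user_ids: {null_users}, "
      f"duplicated order_ids: {dup_ids}, latest row: {latest}")
```

Run the same three checks on every core table in your data map and you have a shippable audit by Friday.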

What To Avoid In Week One

A short list of mistakes:

  • Starting a predictive model before you have established baselines. You are guessing at what matters.
  • Rebuilding a dashboard that already exists because you do not like how it looks. Not your call yet.
  • Ignoring the analytics engineers. They own the tables you depend on; their priorities are your ceiling. Befriend them.
  • Presenting analysis with no recommendation. A chart without a "so what" is not analysis. It is homework.
  • Using fancy methods when simple ones work. A cohort retention curve delivered on Thursday beats a causal inference paper delivered in six weeks.

Next Steps

  1. Before noon on Monday, confirm warehouse, BI tool, Git, notebook environment, and event stream access. Run one sanity-check query. Escalate anything blocked.
  2. By end of day Tuesday, rebuild the team's top metric from raw tables and reconcile your number against the canonical dashboard. Share a short writeup with your manager and the analytics engineer who owns the underlying model.
  3. By Wednesday, publish a one-page data map — core tables, upstream sources, event taxonomy, freshness. This document is for you first and the team second.
  4. By Thursday, have eight 1:1s booked across week one and week two. Use the three listening-tour questions. Keep a running doc of patterns.
  5. By end of day Friday, ship one analysis or dashboard to the team Slack channel. Keep it short, keep it shipped, and start week two with feedback rather than a draft.