
Don’t neglect the fundamentals
I firmly believe that a strong grasp of the basics in one’s field is indispensable. Yet, this belief often conflicts with the current culture in data and analytics, which emphasizes rapid results. This culture is driven by tools designed for easy access and quick creation of analytical assets like reports and dashboards.
To put this in context: when I began my career in BI a decade ago, my first three projects revolved around developing data warehousing solutions with reporting layered on top. This involved building an Operational Data Store (ODS), tracking changes when necessary, and designing a dimensional model tailored to business needs. Understanding the distinction between dimensions and facts was essential. The dimensional model would then be “de-normalized” to present a cohesive and comprehensive view to the business.
With the advent of GUI-based tools for data preparation and dashboarding, I see that this rigorous approach is often overlooked, potentially undermining the robustness and reliability of the analytics produced. Here are two examples that illustrate this point:
Example 1: Overreliance on New Technologies
In one project, a client adopted a new data processing platform touted for its ability to handle large data volumes with minimal code optimization. A few months in, however, performance issues began to surface. Fortunately, some team members had an “old school” mindset and paid attention to foundational practices like indexing and optimizing join and filter clauses. They discovered that the analytical processes were filtering on a non-indexed column. Adding an index cut the load time of this table from 15 minutes to just 2 minutes. Similarly, rewriting joins and filters to use ID columns instead of value columns significantly improved performance. The takeaway: no technology can replace a solid understanding of fundamentals. Modern tools are powerful, but knowing the difference between columnar and row-oriented databases, and why the latter is generally a poor fit for analytical workloads, can make all the difference.
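To make the indexing point concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not the client's platform: the `orders` table, its columns, and the index name are all hypothetical. It only shows how a query plan changes once the filtered column has an index.

```python
import sqlite3

# Hypothetical table, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [(f"region_{i % 50}", float(i)) for i in range(10_000)],
)

query = "SELECT SUM(amount) FROM orders WHERE region = 'region_7'"


def plan(sql):
    # Each EXPLAIN QUERY PLAN row is (id, parent, notused, detail);
    # the last field is the human-readable description.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# Without an index on region, SQLite has to scan the whole table.
plan_before = plan(query)

# With an index on the filtered column, it can search the index instead.
conn.execute("CREATE INDEX idx_orders_region ON orders (region)")
plan_after = plan(query)

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording varies by SQLite version, but the first plan should describe a scan of `orders` and the second a search using `idx_orders_region`. Checking the plan before and after a change like this is a cheap habit on most SQL engines.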
Example 2: Building Credibility Through Fundamentals
Understanding the core concepts of data and analytics not only bolsters technical skills but also enhances career growth. This year, I had discussions with several Chief Data Officers (CDOs) who initially saw me as just another analytics person, someone who creates dashboards and reports. However, my ability to discuss the underlying data model and propose enhancements to make it more robust led to increased trust. This credibility allowed me to unlock new opportunities and projects.
With that in mind, I would like to share a list of essential topics for data analytics professionals to explore during their first 10 days on the job. Hope you find it useful.
Before Accessing Data
- Understand what structured (tabular) data is.
- Learn about data storage formats: databases, spreadsheets, text files, etc.
- Recognize the difference between transactional and analytical data, as well as their storage and processing methods.
- Master SQL (Structured Query Language). Learning any SQL dialect is beneficial, as most follow the core of the ANSI standard.
- Familiarize yourself with other data manipulation tools like Excel — outdated yet widely used and valuable.
Before Analyzing Data
- Learn how to aggregate data.
- Understand data filtering techniques.
- Study how to join datasets and when to do so.
- Learn how and when to union datasets.
- Learn how to determine the correct sequence for performing operations (e.g., should you aggregate data before or after joining datasets?).
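The last point, the order of operations, is easiest to see with a small example. The sketch below uses Python's built-in sqlite3 module with two hypothetical tables, `orders` and `shipments`, where one order can have several shipments. Joining first and aggregating afterwards counts an order's amount once per matching shipment. Aggregating the shipments to one row per order before joining avoids that.

```python
import sqlite3

# Hypothetical tables, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL);
CREATE TABLE shipments (shipment_id INTEGER PRIMARY KEY, order_id INTEGER);
INSERT INTO orders VALUES (1, 100.0), (2, 50.0);
INSERT INTO shipments VALUES (10, 1), (11, 1), (12, 2);
""")

# Join first, aggregate second: order 1 matches two shipments,
# so its amount appears twice in the joined rows.
join_then_aggregate = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders o
    JOIN shipments s ON s.order_id = o.order_id
""").fetchone()[0]

# Aggregate shipments to one row per order first, then join:
# each order now matches exactly one row.
aggregate_then_join = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders o
    JOIN (
        SELECT order_id, COUNT(*) AS shipment_count
        FROM shipments
        GROUP BY order_id
    ) s ON s.order_id = o.order_id
""").fetchone()[0]

print(join_then_aggregate)   # 250.0 (order 1 counted twice)
print(aggregate_then_join)   # 150.0 (the true order total)
```

Neither order is always right. The useful habit is to ask how many rows each side of the join contributes per key before summing anything.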
Bonus: Converting raw data to analytical data
- Recognize what raw or transactional data is.
- Learn the principles of dimensional modeling, including primary keys, granularity, normalization, and de-normalization.
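As a rough illustration of these ideas, here is a tiny star schema built with Python's built-in sqlite3 module. Every table and column name is hypothetical. The fact table has a declared grain of one row per product per day, the dimensions carry the descriptive attributes, and a view de-normalizes everything into the flat shape a report would consume.

```python
import sqlite3

# Hypothetical star schema, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    product_name TEXT,
    category TEXT
);
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,
    calendar_date TEXT,
    month TEXT
);
-- Grain: one row per product per day.
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    units_sold INTEGER,
    revenue REAL,
    PRIMARY KEY (date_key, product_key)
);
INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware');
INSERT INTO dim_date VALUES
    (20240101, '2024-01-01', '2024-01'),
    (20240102, '2024-01-02', '2024-01');
INSERT INTO fact_sales VALUES
    (20240101, 1, 3, 30.0),
    (20240101, 2, 1, 25.0),
    (20240102, 1, 2, 20.0);
-- De-normalized view: facts with dimension attributes attached.
CREATE VIEW sales_flat AS
SELECT d.calendar_date, d.month, p.product_name, p.category,
       f.units_sold, f.revenue
FROM fact_sales f
JOIN dim_date d ON d.date_key = f.date_key
JOIN dim_product p ON p.product_key = f.product_key;
""")

# A report query reads the flat view and never needs to know about the joins.
rows = conn.execute("""
    SELECT category, month, SUM(revenue)
    FROM sales_flat
    GROUP BY category, month
""").fetchall()
print(rows)  # [('Hardware', '2024-01', 75.0)]
```

The composite primary key on `fact_sales` enforces the grain: a second row for the same product and day would be rejected instead of silently inflating totals.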