Informing Unstructured Data Processing in BigQuery ML

Why

The BigQuery Machine Learning (BQML) team needed to evaluate how unstructured data—such as images, PDFs, videos, music, and handwritten notes—could be effectively processed to generate actionable insights. BigQuery clients had repeatedly requested unstructured data support within Google Cloud Storage. To meet this demand, the team prioritized research into real-world use cases and potential implementation pathways before introducing the capability, ensuring product design aligned with user needs.


How

The study engaged 15 subject matter experts across diverse industries, including healthcare, vaccine manufacturing, government agencies, social media, music, and security/surveillance. Each 60-minute interview explored:

  • Current practices for handling unstructured data.
  • Tools and platforms in use.
  • Recurring challenges in storage and metadata.
  • Best practices adopted within and across industries.

This deep dive provided a cross-section of perspectives on the complexity of managing unstructured data at scale.


Findings

  • Metadata & Storage Challenges: Participants consistently highlighted difficulties in choosing the right storage solutions and managing metadata effectively.
  • Terminology Confusion: The interchangeable use of terms and brand names across regions and industries led to significant misalignment, slowing adoption.
  • Process Gaps: Many workflows lacked clarity, with ad-hoc practices limiting efficiency and scalability.

Impact

The research gave the BQML team a comprehensive view of unstructured data use cases. It identified metadata management and tagging as critical priorities for product development. These insights directly informed product direction, shaping design choices for unstructured data processing in BigQuery and laying the foundation for future enhancements in usability and adoption.



* Diagram created in Lucidchart to map the terms and products that external data experts mentioned during the UX research study. The ETL process is drawn as a flow, with each tool placed at the stage where it is used.*