In this contributed article, Tom Scott, CEO of Streambased, outlines the path event streaming systems have taken to arrive at the point where they must adopt analytical use cases and looks at some possible futures in this area.
Navigating Data Lake Challenges: Governance, Security, and GDPR Compliance
In this contributed article, Coral Trivedi, Product Manager at Fivetran, discusses how enterprises can get the most value from a data lake, covering automation, security, pipelines, and GDPR compliance issues.
Why Do We Prefer ELT Rather than ETL in the Data Lake? What is the Difference between ETL & ELT
In this article, Ashutosh Kumar discusses the emergence of modern data solutions that have led to the development of ELT and ETL, each with its own features and advantages. ELT is more popular due to its ability to handle large and unstructured datasets such as those found in data lakes. Traditional ETL has evolved into cloud-based ETL, which allows rapid batch processing, scalability, savings, and simplicity while maintaining security, governance, and compliance.
Data Virtualization’s Ubiquity: Data Meshes, Data Products, Data Lake Houses, Data Fabrics
In this contributed article, editorial consultant Jelani Harper discusses how data virtualization is the underlying technology for some of the most progressive architectures today, including that of the data mesh, data lake house, and data fabric. Although it’s still regarded as a desirable, dynamic means of integrating data, it’s silently reshaping itself into something that encompasses this attribute but, ultimately, is much more.
Real-Time Analytics from Your Data Lake: Teaching the Elephant to Dance
This whitepaper from Imply Data Inc. explains why delivering real-time analytics on a data lake is so hard, approaches companies have taken to accelerate their data lakes, and how they leveraged the same technology to create end-to-end real-time analytics architectures.
Real-Time Analytics from Your Data Lake: Teaching the Elephant to Dance
This whitepaper from Imply Data Inc. introduces Apache Druid and explains why delivering real-time analytics on a data lake is so hard, approaches companies have taken to accelerate their data lakes, and how they leveraged the same technology to create end-to-end real-time analytics architectures.
Introducing Apache Druid
This whitepaper provides an introduction to Apache Druid, including its evolution, core architecture and features, and common use cases. Founded by the authors of the Apache Druid database, Imply provides a cloud-native solution that delivers real-time ingestion, interactive ad-hoc queries, and intuitive visualizations for many types of event-driven and streaming data flows.
Okera Delivers Industry’s First Real-Time Actionable Insights into Data Lakes
Okera, a leading active data management platform that enables companies to discover, audit, and protect data at scale, announced Okera Spotlight for Amazon Web Services (AWS) users, the first and only solution to provide full visibility with real-time and continuous audit of your Amazon Simple Storage Service (S3) data lake.
Data Lakes: Big Data Quarterly
As with any major IT initiative, cost savings drive many data lake initiatives. However, the value will ultimately be realized in the potential avenues they offer for business growth. The next frontier for data lakes is providing organizations with greatly enhanced analytical opportunities.
Data Lakes Principles and Economics
This Checklist Report discusses what your enterprise should consider before diving into a data lake project, whether it’s your first, second, or even third major data lake project. Presumably, adherence to these principles will become second nature to the data lake team, and they will even improve upon them at some point.