holds a vast amount of raw data in its native format until it is needed. - http://searchaws.techtarget.com
• A data lake is a storage repository that holds a vast amount of raw data in its native format, including structured, semi-structured, and unstructured data. The data structure and requirements are not defined until the data is needed. - Tamara Dull (SAS), https://www.kdnuggets.com
• It stores data in its native (raw) format
• The schema is applied at query time
• Sometimes it is also just a "marketing label" that simplifies things for people: it names technology compatible with Hadoop, much as "big data" is used as a label for distributed storage and query engines
Data Lake by Definitions
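The idea that data is kept raw and the schema is applied only at query time can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not a real data lake engine; the names `RAW_EVENTS`, `SPEND_SCHEMA`, and `read_with_schema` are hypothetical and do not come from the definitions above.

```python
import json

# Raw records are stored untouched, in whatever shape the producer sent them.
# (Hypothetical sample data: a list of JSON strings standing in for files in the lake.)
RAW_EVENTS = [
    '{"user": "ana", "amount": "12.50", "ts": "2020-01-03"}',
    '{"user": "bo", "amount": 7, "extra": {"coupon": "X1"}}',
    '{"user": "cy"}',
]

# The reader picks a schema per query: field name -> (cast function, default value).
SPEND_SCHEMA = {"user": (str, ""), "amount": (float, 0.0)}


def read_with_schema(raw_lines, schema):
    """Parse raw JSON lines and project them onto `schema` at read time."""
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        row = {}
        for field, (cast, default) in schema.items():
            value = record.get(field)
            # Missing fields get the default; present ones are cast to the schema type.
            row[field] = cast(value) if value is not None else default
        rows.append(row)
    return rows


rows = read_with_schema(RAW_EVENTS, SPEND_SCHEMA)
total = sum(row["amount"] for row in rows)
```

Nothing is rejected or reshaped when the records are stored. A different query could read the same raw lines with a different schema, for example one that also pulls out `ts` or `extra`.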