In Spark, we have historically done reasonably well in interface and API design, especially compared with some other Big Data systems. However, we have also made mistakes along the way. I want to share a talk I gave about interface design at Databricks' internal retreat.
Interface design is a vital part of Spark becoming a sustainable, thriving framework over the long term. Good interfaces can be the project's biggest asset, while bad interfaces can be its worst technical debt. As the project grows, the community is expanding, and we are getting a wider range of contributors who have not had to think about interface design in their everyday development experience outside Spark.
Interface design is part art, part science, and in some sense an acquired taste. However, I think there are common issues that can be spotted easily, and common principles that can address much of the low-hanging fruit. Through this presentation, I hope to bring the issue of interface design to everybody's attention and encourage everybody to think hard about it in their contributions.