We asked DataChat’s co-founder, Rogers Jeffrey Leo John, to tell us about the origins of the company. What was missing from the marketplace? How has DataChat pushed the envelope?
The idea formed after Jignesh [Patel, our CEO and co-founder] was a visiting scientist at Pivotal Labs. He observed their data teams and the problems they were trying to solve. He noticed that most of the problems (and their solutions) followed similar patterns: when training a model, they ended up in a loop of loading the Python package, selecting features, training the model, then analyzing the results. They would then tweak the features and retrain the model, over and over again.
With that observation in mind, we wrote our first paper. In that paper, we suggested an early prototype of Ava, our Conversational Intelligence assistant, that could abstract the model training loop into a Python template.
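To make the loop concrete, here is a minimal sketch of what such a template might look like. It is not the prototype from the paper: the function names (`fit_line`, `mse`, `run_training_loop`), the toy data, and the choice of a one-feature least-squares model are all assumptions made for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one feature: y ~ slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return slope, my - slope * mx


def mse(model, xs, ys):
    """Mean squared error of a fitted line on the given points."""
    slope, intercept = model
    return mean((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))


def run_training_loop(rows, target, candidate_features):
    """Select a feature, train, analyze the result, then repeat for the next one."""
    ys = [row[target] for row in rows]
    scores = {}
    for feature in candidate_features:          # "tweak the features"
        xs = [row[feature] for row in rows]     # select
        model = fit_line(xs, ys)                # train
        scores[feature] = mse(model, xs, ys)    # analyze
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical toy data: price depends on size, not on age.
rows = [
    {"size": 1.0, "age": 5.0, "price": 2.0},
    {"size": 2.0, "age": 3.0, "price": 4.0},
    {"size": 3.0, "age": 8.0, "price": 6.0},
    {"size": 4.0, "age": 1.0, "price": 8.0},
]
best_feature, scores = run_training_loop(rows, "price", ["size", "age"])
```

The point of the template is that only the inputs (data, target, candidate features) change between runs; the select, train, and analyze steps stay the same.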
We realized that, by leveraging controlled natural language (CNL), we could abstract away the programming languages (Python, R, SQL, etc.) from the user in favor of a subset of English. That was the genesis of DataChat’s Guided English Language© (GEL), which was inspired by the “language” used by aviators, such as the NATO phonetic alphabet. GEL allows the user to build data science workflows without needing to know Python, R, SQL, or any other traditional data science tool.
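As a rough illustration of the controlled natural language idea, the sketch below maps a small, fixed set of English sentence patterns onto structured operations. The sentence patterns, operation names, and `parse` function are hypothetical and are not actual GEL syntax or DataChat code.

```python
import re

# Hypothetical sentence patterns for illustration only; not actual GEL syntax.
PATTERNS = [
    (re.compile(r"^load data from the file (?P<path>\S+)$", re.IGNORECASE), "load"),
    (
        re.compile(
            r"^keep the rows where (?P<column>\w+) is greater than "
            r"(?P<value>-?\d+(?:\.\d+)?)$",
            re.IGNORECASE,
        ),
        "filter_gt",
    ),
    (re.compile(r"^compute the average of (?P<column>\w+)$", re.IGNORECASE), "average"),
]


def parse(sentence):
    """Turn one controlled-English sentence into a structured operation."""
    text = sentence.strip().rstrip(".")
    for pattern, op in PATTERNS:
        match = pattern.match(text)
        if match:
            return {"op": op, **match.groupdict()}
    # Anything outside the fixed patterns is rejected rather than guessed at.
    raise ValueError(f"Sentence is outside the controlled language: {sentence!r}")
```

For example, `parse("Keep the rows where price is greater than 10.")` yields an operation with `op` set to `filter_gt`, `column` set to `price`, and `value` set to `10`. Because the language is a restricted subset of English, every accepted sentence has exactly one meaning, which is what lets it stand in for Python, R, or SQL.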
We spun our research out into DataChat and have been growing and evolving ever since.
While developing Ava and GEL to make model training more intuitive, we’ve also expanded GEL to cover a wide array of data science tools and functions, including data ingestion, data wrangling, and visualization, along with machine learning and explainable artificial intelligence. This makes us a truly all-in-one platform that allows more business users to work with their own data to answer their own questions without needing to learn how to code or work with more complicated data science tools.
One problem we’re solving is the reproducibility gap. A few years ago, the industry didn’t care about reproducibility; teams were more concerned with model accuracy than with how they got there. We baked reproducibility into DataChat from the beginning.
By having conversations with Ava in GEL, we’re actually automating the documentation and commenting pieces of the data science process, too. Our workflows are built in English, which makes it very easy to look back and see exactly what happened and when. This not only makes it easy to understand the logic behind the pipeline, but also improves governance and transparency across an organization.
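One way to picture a workflow that documents itself is to store each step as an English sentence alongside the operation it runs. The sketch below assumes a hypothetical `Workflow` class and made-up step sentences; it is not DataChat's implementation.

```python
class Workflow:
    """Records each step as a plain-English sentence plus the function it runs."""

    def __init__(self):
        self.steps = []

    def add(self, sentence, func):
        self.steps.append((sentence, func))
        return self  # allow chaining

    def run(self, data):
        """Replay every recorded step, in order, on the given data."""
        for _, func in self.steps:
            data = func(data)
        return data

    def transcript(self):
        """The numbered English record of what the workflow does."""
        return "\n".join(
            f"{i}. {sentence}" for i, (sentence, _) in enumerate(self.steps, 1)
        )


# Hypothetical steps for illustration.
wf = (
    Workflow()
    .add("Keep the values greater than 10", lambda xs: [x for x in xs if x > 10])
    .add("Compute the average", lambda xs: sum(xs) / len(xs))
)
result = wf.run([4, 12, 18, 30])
```

Here the transcript is the documentation, and replaying the same steps on the same input gives the same result, which is the reproducibility property described above.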