AI-driven Platform Identifies and Remediates Biases in Data
Synthesized has released the Community Edition of its data platform for Bias Mitigation. Released as a freemium version, the offering incorporates AI research and cutting-edge techniques to enable any organization to quickly identify potential biases within their data and immediately start to remediate these flaws.
The platform was designed by the London-based firm to understand a wide array of regulatory and legal definitions regarding contextual bias. It can automatically identify bias across data attributes like gender, age, race, religion, sexual orientation, and more. As its name suggests, Synthesized specializes in creating synthetic data, that is, data that is artificially generated. Using AI models, the platform constructs a new, entirely synthetic data set from the original information, one that is highly statistically accurate (up to 95%) but crucially does not reveal customers’ personally identifiable information (PII). For this reason, the company works with a host of financial organizations.
Synthesized is making the capability available immediately, requiring no coding or deep technical expertise to get started. Users simply upload a structured data file, like a spreadsheet, to kick off the analysis process. The inherent simplicity of the platform allows it to be used across industries. The data platform could be used in finance to create fairer credit ratings, in insurance to assess claims more equitably, in human resources to identify bias in the hiring process, and in universities to ensure that admission decisions are fair.
“The reputational risk of all organizations is under threat due to biased data and we’ve seen this will no longer be tolerated at any level,” said Dr Nicolai Baldin, CEO and founder of Synthesized. “It’s a burning priority now and must be dealt with as a matter of urgency, both from a legal and ethical standpoint. Synthesized’s Community Edition for Bias Mitigation is one of the first offerings specifically created to understand, investigate, and root out bias in data. We designed the platform to be very accessible, easy-to-use and highly scalable, as organizations have data stored across a huge range of databases and data silos.”
Rebalancing Biased Data
Beyond this deep analysis and bias detection, the platform also offers another extremely powerful feature: to automatically remove the biases present in an entire data set in a process called rebalancing.
While there are a number of existing, limited techniques to rebalance biased data, Synthesized has developed a proprietary algorithm within its platform that it says is quicker and more accurate. The AI-driven platform can make randomized changes, at scale, to an original, biased data set to construct a new, entirely synthetic data set. By generating synthetic data, Synthesized’s platform lets users distribute all attributes within a data set equally, removing bias and rebalancing the data set completely. Users can also manually change individual data attributes within a data set, such as gender, providing granular control of the rebalancing process.
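To make the idea of rebalancing concrete, here is a minimal sketch of one of the simpler existing techniques, random oversampling, which duplicates rows from under-represented groups until every value of an attribute appears equally often. This is not Synthesized's proprietary algorithm, which is not described in this article and generates new synthetic rows rather than duplicating existing ones. The function name `rebalance`, the column names, and the sample rows are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def rebalance(rows, attribute, seed=0):
    """Oversample rows so every value of `attribute` appears equally often.

    Illustrative only: plain random oversampling, not a synthetic-data generator.
    """
    if not rows:
        return []
    rng = random.Random(seed)

    # Group the rows by the value of the sensitive attribute.
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row)

    # Every group is grown to the size of the largest group.
    target = max(len(members) for members in groups.values())

    balanced = []
    for members in groups.values():
        balanced.extend(members)
        # Draw extra rows with replacement until the group reaches the target.
        balanced.extend(rng.choices(members, k=target - len(members)))

    rng.shuffle(balanced)
    return balanced


# Hypothetical, deliberately skewed data set: 2 rows for "F", 6 rows for "M".
rows = [{"gender": "F", "approved": 1}] * 2 + [{"gender": "M", "approved": 1}] * 6
balanced = rebalance(rows, "gender")
counts = Counter(row["gender"] for row in balanced)
print(dict(counts))
```

After rebalancing, both groups contain six rows. Oversampling only repeats existing records, which is one reason a synthetic-data approach may be preferred in practice.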
Community Edition for Bias Mitigation – How It Works
- Free sign up: Your organization can sign up at no cost.
- Easy to get started: Upload a structured data file, like an Excel spreadsheet, to kick off the analysis process. Users can also connect to relational database services, including AWS, Azure, Google Cloud, Oracle, and others, to build custom datasets for analysis. The platform learns the structure of the data in real time, and the analysis process can crunch over four million rows of data in roughly ten minutes.
- Bias summary and score: Once the analysis is complete, users are provided with a Synthesized Total Fairness Score that shows what percentage of the data set contained biased data. The platform also highlights areas of the data in which bias was detected.
- Rebalancing: As mentioned, the final feature available in this process is the ability to automatically rebalance biased data.
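As a rough illustration of what a bias summary can measure, the sketch below computes the rate of positive outcomes for each group of a sensitive attribute and the gap between the highest and lowest rates (often called the demographic parity difference). This is an assumed, generic metric chosen for illustration; the article does not describe how the Synthesized Total Fairness Score is calculated. The function names, column names, and sample rows are hypothetical.

```python
from collections import defaultdict


def positive_rates(rows, attribute, outcome):
    """Return the share of positive outcomes for each value of `attribute`."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in rows:
        group = row[attribute]
        totals[group] += 1
        if row[outcome]:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


def parity_gap(rates):
    """Difference between the best- and worst-treated groups (0 means equal rates)."""
    return max(rates.values()) - min(rates.values())


# Hypothetical loan decisions: 1 of 4 approved for "F", 3 of 4 approved for "M".
rows = (
    [{"gender": "F", "approved": 1}]
    + [{"gender": "F", "approved": 0}] * 3
    + [{"gender": "M", "approved": 1}] * 3
    + [{"gender": "M", "approved": 0}]
)
rates = positive_rates(rows, "gender", "approved")
gap = parity_gap(rates)
print(rates, gap)
```

A gap of zero would mean both groups are approved at the same rate; a larger gap flags an attribute that deserves a closer look.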
Synthesized’s Complete Solution
The Community Edition is one part of Synthesized’s data platform. The complete platform uses AI to automate all stages of data provisioning, the process of making data available in an orderly and secure way. This level of automation enables organizations to generate synthesized datasets, allowing them to better test data for new products and tools, validate mathematical models, or train machine learning models.
Synthesized says its platform removes the heavy and costly burden of finding, collecting, and preparing data. Gartner estimates that data scientists and test engineers currently waste up to 80% of their valuable time on such repetitive tasks. Synthesized’s data platform helps organizations unlock and maximize the true value of their data.
The company was founded in 2017 by Dr Nicolai Baldin during his transition from academia to working with public bodies in the UK. While pursuing his PhD in Statistics and Machine Learning at the University of Cambridge, he identified a significant gap between the advancements made by the scientific community and those made by major organizations, and created a platform to bridge this gap.