
Synthetic Data Overview

What Is Included

The downloaded package includes a ZIP file containing your data, along with a readme file, a disclaimer, and a license. Together these give you the information and permissions you need to understand and use the synthetic patient data.
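
As a quick check after downloading, the contents of the ZIP file can be listed before extracting anything. The sketch below uses only the Python standard library. The file name `synthetic_data.zip` and the helper name `list_package_contents` are assumptions made for illustration; the actual file names in your package may differ.

```python
import zipfile
from pathlib import Path


def list_package_contents(zip_path):
    """Return the names of the files stored in a ZIP archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()


if __name__ == "__main__":
    # Hypothetical file name; replace it with the path to your download.
    package = Path("synthetic_data.zip")
    if package.exists():
        for name in list_package_contents(package):
            print(name)
    else:
        print(f"{package} not found; point this at your downloaded ZIP file.")
```

Listing the archive first makes it easy to confirm that the data file, readme, disclaimer, and license are all present.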

How Can I Use The Data?

You can use the data for:

  • Educational Projects: Learn data analysis techniques and disease modeling.

  • Research Simulations: Study patterns and interactions in synthetic data.

  • Algorithm Testing: Validate machine learning algorithms and models.

  • Training and Workshops: Demonstrate data handling and analysis methods.

Please note that the data is not suitable for clinical decision-making or for predicting real patient outcomes.

What The Data Is

The downloaded file contains rows (samples) and columns (features) that describe fictitious patients and mimic the selected disease. The values are based on reviewed research so that they reflect realistic feature values and interactions. Some values are inferred to fill gaps in the research, and because of the complexity of nonlinear interactions the data may not fully represent every feature of the disease. The data is designed to highlight the complexity of rheumatic disease and to serve as a tool for data analysis exercises.

What The Data Is Not

The data is not real patient data and should not be used for clinical decision-making. It does not provide a complete or fully accurate representation of the selected disease, as some values are inferred due to gaps in research. This data is not suitable for predicting actual patient outcomes or for any medical use. It is intended solely for educational purposes and data analysis exercises.


About the Algorithm 

Multiple Data Types

The algorithm generates several data types for each sample, including:

  • Demographic Data: Ethnicity, age, etc. (0 = does not have, 1 = has)

  • Serum Data: Blood test results in adjusted units (continuous values)

  • Impression Data: Diagnoses (0 = does not have, 1 = has)

  • Medication Data: Medications taken (0 = does not have, 1 = has)

These data types provide a comprehensive view of synthetic patient profiles for detailed analysis and research.
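
Because the data mixes 0/1 indicator columns with continuous serum values, a common first step in analysis is to tell the two kinds of column apart. The following is a minimal sketch of one way to do that in plain Python. The function name `split_feature_types` and every column name in `sample_rows` are invented for illustration and are not taken from the actual dataset.

```python
def split_feature_types(rows):
    """Split column names into binary (only 0/1 values) and continuous."""
    binary, continuous = [], []
    for column in rows[0]:
        values = {float(row[column]) for row in rows}
        if values <= {0.0, 1.0}:
            binary.append(column)
        else:
            continuous.append(column)
    return binary, continuous


# Hypothetical samples; the column names are assumptions, not real headers.
sample_rows = [
    {"age_over_50": 1, "serum_marker_a": 3.2, "diagnosis_x": 0, "medication_y": 1},
    {"age_over_50": 0, "serum_marker_a": 5.7, "diagnosis_x": 1, "medication_y": 0},
]

binary_cols, continuous_cols = split_feature_types(sample_rows)
print("binary:", binary_cols)
print("continuous:", continuous_cols)
```

A value-based check like this is only a heuristic: a continuous column that happens to contain only 0 and 1 in a small sample would be classified as binary, so the readme in the package remains the better source for column meanings.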

Data Type Interactions

To enhance accuracy and complexity, the different data types within the dataset influence each other across groups. For example, medications assigned to a sample can impact serum data results by dynamically adjusting values to reflect the effects of the medication and associated diseases. This interaction ensures a more realistic and interconnected dataset.
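
To make the idea of one data type influencing another concrete, here is a toy sketch in which a medication flag scales a serum value. It is not the generator's actual method, which this page does not describe. The effect table, the multiplier 0.8, the column names, and the function `apply_medication_effects` are all hypothetical.

```python
# Hypothetical effect table: medication column -> {serum column: multiplier}.
# The names and the 0.8 factor are assumptions chosen only for illustration.
MEDICATION_EFFECTS = {"medication_y": {"serum_marker_a": 0.8}}


def apply_medication_effects(sample, effects=MEDICATION_EFFECTS):
    """Return a copy of the sample with serum values adjusted for medications."""
    adjusted = dict(sample)
    for medication, serum_changes in effects.items():
        # Only adjust when the sample is flagged as taking the medication.
        if adjusted.get(medication) == 1:
            for serum, factor in serum_changes.items():
                adjusted[serum] = adjusted[serum] * factor
    return adjusted


treated = apply_medication_effects({"medication_y": 1, "serum_marker_a": 5.0})
untreated = apply_medication_effects({"medication_y": 0, "serum_marker_a": 5.0})
print(treated["serum_marker_a"], untreated["serum_marker_a"])
```

In this toy version the treated sample's serum value is lowered while the untreated sample is left unchanged. Dependencies of this kind are what an analysis of the dataset could try to recover.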

Analytics

We use advanced AI to analyze the complexity of our synthetic patient data, ensuring it is both challenging and highly representative of real-world scenarios. Our causal analytics further ensure that any disruptions within the data closely mimic natural occurrences. 

Research

We have reviewed hundreds of research papers to generate values for each data type that reflect, as closely as possible, patients with the selected disease. To learn more, you can review the full bibliography by selecting the link below.

Continue To Step 02
