Installation

To install StatInsight on your computer, just follow the simple steps below. The process is quick and easy, even if you're not very technical. If anything goes wrong, feel free to contact us for help.

Here is an example of the app installation for the Windows Operating System (OS).

Step 1: Go to the Download page and click the Windows icon, then press the yellow Download button for that operating system.

Step 2: Save the file named Win_StatInsight_Installer.exe when prompted. If Windows SmartScreen shows a warning, click More info and then Run anyway to allow the installer from an unknown publisher.

Step 3: When the installer starts, read the License Agreement, select I accept the agreement, and click Next to continue.

Step 4: Follow the remaining steps in the setup. When it says the installation is complete, leave Launch StatInsight checked and click Finish.

Step 5: The app will open. It will show the license agreement again; just press Accept to continue.

Step 6: Enter the License Key you acquired. If you don't have one yet, please get a key here. After that, you're ready to start using StatInsight.


🛠 Need help?
If you run into any issues, please contact us. We're happy to help!

Loading Data

StatInsight supports the most common file formats used in research and data analysis. Simply open your file and the app takes care of the rest, with no manual configuration needed.

Supported File Formats

  • CSV (.csv): the most widely used tabular format. StatInsight automatically detects the delimiter used in your file (comma, semicolon, pipe, or tab), so you don't need to specify it.
  • Excel (.xls, .xlsx): both legacy and modern Excel formats are fully supported.
  • RTF (.rtf): Rich Text Format files containing tabular data are automatically converted and parsed.
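
To give a feel for how delimiter detection of this kind can work, here is a minimal Python sketch built on the standard csv.Sniffer class. It is an illustration of the general technique only, not StatInsight's own code; the helper name detect_delimiter and the sample text are made up for this example.

```python
import csv

# Candidate delimiters from the list above: comma, semicolon, pipe, tab.
CANDIDATES = ",;|\t"

def detect_delimiter(sample: str) -> str:
    """Guess the delimiter of a CSV text sample (hypothetical helper)."""
    dialect = csv.Sniffer().sniff(sample, delimiters=CANDIDATES)
    return dialect.delimiter

# Made-up sample: a header row followed by three data rows.
sample = "id;group;score\n1;A;3.5\n2;B;4.1\n3;A;2.9\n"
print(repr(detect_delimiter(sample)))
```

The sniffer looks for a character that appears the same number of times on every line, which is why a clean header row and consistent rows make detection more reliable.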

Automatic Variable Classification

Once a file is loaded, StatInsight automatically analyzes each column and assigns it a variable type. The five supported types are:

  • Continuous: numeric data with many unique values (e.g. measurements, test scores, weights).
  • Categorical: data with a limited set of distinct groups (e.g. treatment group, country, blood type).
  • Binary: columns with exactly two distinct values (e.g. yes/no, male/female, 0/1).
  • Date: date or time values. Common formats are recognized automatically (YYYY-MM-DD, DD-MM-YYYY, MM/DD/YYYY, and others).
  • Label: high-cardinality text columns such as names or IDs. These are excluded from statistical analysis and are used for display or identification only.
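
The five types above can be pictured as a simple rule-based classifier. The Python sketch below is only a rough illustration: the cutoff MAX_CATEGORIES, the order of the checks, and the function name classify_column are assumptions, and the rules StatInsight actually applies may differ.

```python
from datetime import datetime

# Date formats named in the list above; the app recognizes others as well.
DATE_FORMATS = ("%Y-%m-%d", "%d-%m-%Y", "%m/%d/%Y")
# Assumed cutoff between "limited set of groups" and "many unique values".
MAX_CATEGORIES = 10

def _is_number(text):
    try:
        float(text)
        return True
    except ValueError:
        return False

def _is_date(text):
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(text, fmt)
            return True
        except ValueError:
            continue
    return False

def classify_column(values):
    """Return a variable type for a column of text cells (hypothetical rules)."""
    cells = [v.strip() for v in values if v.strip()]
    if not cells:
        return None  # a column with no data is left out of the analysis
    distinct = set(cells)
    if len(distinct) == 2:
        return "Binary"
    if all(_is_date(c) for c in cells):
        return "Date"
    if len(distinct) <= MAX_CATEGORIES:
        return "Categorical"
    if all(_is_number(c) for c in cells):
        return "Continuous"
    return "Label"  # many distinct text values, such as names or IDs

print(classify_column(["yes", "no", "yes", "no"]))
print(classify_column(["2024-01-05", "2024-02-10", "2023-12-31"]))
```

Because any such rule set can guess wrong (for example, a numeric ID column looks continuous), it is worth reviewing the detected types after loading.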

If the auto-detected type doesn't match your intent, you can change it at any time. Switching between Continuous, Categorical, and Label is supported from the variable panel.

Saving and Reopening Projects

StatInsight lets you save your entire working session (including loaded data, variable settings, and all analyses performed) as a .stati project file. Reopening a project restores everything exactly as you left it.

Tip: For best results, make sure your file has a clean header row as the first row, with data values starting immediately below. Avoid merged cells, blank leading rows, or multi-level headers.

Descriptives

Once you load a data file into StatInsight, the app will immediately analyze it and generate descriptive statistics for all the detected variables.

StatInsight automatically classifies each variable based on its content. The supported types are Continuous, Categorical, Binary, Date, and Label.

Depending on the variable type, different descriptive summaries are provided:

  • Continuous: Mean, standard deviation (STD), median, minimum, maximum, and histograms. Additionally, a normality analysis is performed.
  • Categorical / Binary: Number of categories and counts for each, along with bar plots.
  • Date: Minimum and maximum dates, plus a timeline distribution plot.

In the "Data Summary" pane, you'll also find an option to remove outliers. This feature is available only for continuous variables and allows you to filter out values that fall outside the typical range.

StatInsight uses a standard method to identify outliers, based on the interquartile range (IQR). The formula used is:

Outlier if:
value < Q1 − 1.5 × IQR
or
value > Q3 + 1.5 × IQR

Where Q1 is the first quartile, Q3 is the third quartile, and IQR is the interquartile range (Q3 − Q1).
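
As a worked illustration of the rule above, here is a short Python sketch using only the standard library. The function name split_outliers and the sample values are made up, and the quartile method (the "inclusive" interpolation of statistics.quantiles) is an assumption; other quartile conventions can give slightly different bounds.

```python
from statistics import quantiles

def split_outliers(values):
    """Split values into (kept, outliers) with the 1.5 x IQR rule."""
    # Quartile method is an assumption; the app may compute Q1 and Q3 differently.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    kept = [v for v in values if lower <= v <= upper]
    outliers = [v for v in values if v < lower or v > upper]
    return kept, outliers

# Made-up data: Q1 = 3, Q3 = 7, IQR = 4, so the accepted range is -3 to 13.
kept, outliers = split_outliers([1, 2, 3, 4, 5, 6, 7, 8, 100])
print(outliers)  # [100]
```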

Additional Notes

  • Missing Variables: If a variable has no data (only a name), it will be excluded from the analysis.
  • Variable Types: Variable types are auto-detected, but you can change them if needed. You can switch between Continuous, Categorical, and Label types.
  • Label Variables: These are not included in statistical analysis and are used mainly for display or grouping purposes.
  • Normality Analysis: For continuous variables, normality is assessed with four checks: the Shapiro-Wilk test, the Anderson-Darling test, a skewness analysis, and a kurtosis analysis. If the majority of these checks suggest normality, a normal distribution is assumed.
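
The majority rule in the last note can be sketched in a few lines of Python. This example only shows the voting step and takes the outcome of each check as given; the function name assume_normal is made up, and treating a 2-2 tie as "not normal" is an assumption, since the text above does not say how ties are resolved.

```python
def assume_normal(verdicts):
    """Return True if a strict majority of checks suggest normality.

    verdicts maps a check name to True when that check suggests normality.
    A tie counts as not normal (assumption, not confirmed by the manual).
    """
    votes = sum(1 for suggests_normal in verdicts.values() if suggests_normal)
    return votes > len(verdicts) / 2

# Made-up verdicts for one variable.
checks = {
    "Shapiro-Wilk": True,
    "Anderson-Darling": True,
    "Skewness": True,
    "Kurtosis": False,
}
print(assume_normal(checks))  # True
```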

Quick Statistics

One of the most powerful features in StatInsight is the "Quick Statistics" function. This tool is designed to automatically explore potential relationships between all pairs of variables in your dataset.

Depending on the variable types involved, StatInsight will apply appropriate statistical tests, including comparisons (e.g., mean differences), correlations (e.g., Pearson or Spearman), and association analyses (e.g., Chi-Squared tests for categorical data).

The goal is to give you a fast, high-level overview of how different variables might be related, without needing to configure anything manually.

Important Notes:

  • Date variables are excluded from Quick Statistics, as they require specialized analysis.
  • No assumptions are made about causality or dependency between variables; Quick Statistics is purely exploratory.
  • If you're looking to perform hypothesis-driven or dependent-variable analysis, you should use the Custom Statistics section instead.
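
The pairing logic described in this section can be pictured with the Python sketch below. It is a simplified illustration, not the app's implementation: the names analysis_family and plan_quick_statistics are made up, and the exact test chosen within each family (for example, Pearson versus Spearman) is not modeled here.

```python
from itertools import combinations

# Date variables are skipped by Quick Statistics; Label variables are
# excluded from statistical analysis altogether.
EXCLUDED_TYPES = {"Date", "Label"}
GROUPING_TYPES = {"Categorical", "Binary"}

def analysis_family(type_a, type_b):
    """Pick a family of tests for a pair of variable types (simplified)."""
    if type_a == "Continuous" and type_b == "Continuous":
        return "correlation"
    if type_a in GROUPING_TYPES and type_b in GROUPING_TYPES:
        return "association"
    return "comparison"  # one continuous variable, one grouping variable

def plan_quick_statistics(variable_types):
    """List every usable pair of variables with its family of tests."""
    usable = {name: t for name, t in variable_types.items()
              if t not in EXCLUDED_TYPES}
    return [(a, b, analysis_family(usable[a], usable[b]))
            for a, b in combinations(usable, 2)]

# Made-up dataset description.
types = {"age": "Continuous", "weight": "Continuous", "sex": "Binary",
         "visit": "Date", "name": "Label", "group": "Categorical"}
for pair in plan_quick_statistics(types):
    print(pair)
```

With four usable variables this yields six pairs, which shows why the number of results grows quickly on wide datasets.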

Custom Statistics

While Quick Statistics explores all variable pairs automatically, Custom Statistics gives you full control. You choose exactly which variables to analyze and which statistical test to apply, making it the right tool for hypothesis-driven research.

To run a custom analysis, select your variables of interest and pick the desired test from the list. StatInsight then computes the results and generates a matching visualization.

Every test result includes:

  • Test statistic and p-value, with significance assessed at α = 0.05
  • Effect size (Cohen's d, correlation coefficient, odds ratio, etc., depending on the test)
  • Statistical power and recommended sample size to achieve 80% power
  • Confidence intervals and group-level summary statistics
  • A plain-language interpretation of the result
  • A visualization tailored to the chosen test
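
To make the effect size and sample size entries more concrete, here is a small Python sketch for the two-group case. It computes Cohen's d with a pooled standard deviation and then estimates the sample size per group needed for 80% power at α = 0.05. The sample size formula is the common normal approximation, which is an assumption on our part; StatInsight may use a different (for example, t-based) calculation, and the function names are made up.

```python
from math import ceil, sqrt
from statistics import NormalDist, mean, variance

ALPHA = 0.05         # significance level used throughout the app
TARGET_POWER = 0.80  # power targeted by the recommended sample size

def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

def sample_size_per_group(d, alpha=ALPHA, power=TARGET_POWER):
    """Approximate n per group for a two-sided, two-sample comparison.

    Normal approximation: n = 2 * ((z_alpha + z_power) / d) ** 2.
    Requires a non-zero effect size d.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)

# Made-up measurements for two groups.
d = cohens_d([2, 4, 6], [1, 3, 5])
print(d)                         # 0.5
print(sample_size_per_group(d))  # 63
```

Smaller effects need far more data: halving d roughly quadruples the recommended sample size.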

Comparison Tests

Use these tests when you want to compare a continuous variable across two or more groups defined by a categorical or binary variable.

  • T-Test: compares the means of two independent groups. Assumes both groups are normally distributed.
  • Paired T-Test: compares two related measurements from the same subjects (e.g. before and after an intervention).
  • Mann-Whitney U: a non-parametric alternative to the T-Test, used when normality cannot be assumed.
  • Wilcoxon Signed-Rank: the non-parametric equivalent of the Paired T-Test for related samples.
  • ANOVA: compares means across three or more groups when data is normally distributed. Includes pairwise post-hoc comparisons.
  • Kruskal-Wallis: a non-parametric alternative to ANOVA. Includes Dunn's post-hoc test for pairwise comparisons.
  • Repeated Measures ANOVA: analyzes changes in a continuous variable across multiple time points within the same subjects.
  • Two-Way ANOVA: examines the effects of two categorical factors, and their interaction, on a continuous outcome.
  • Friedman Test: a non-parametric alternative to Repeated Measures ANOVA.
  • ANCOVA: extends ANOVA by controlling for the influence of one or more continuous covariates.
  • MANOVA: compares multiple continuous outcome variables simultaneously across groups.
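
If you are unsure which of the single-factor tests above fits your data, the choice mostly comes down to three questions: how many groups you have, whether the measurements are related, and whether normality can be assumed. The Python sketch below encodes that reading of the list. It is a simplification for orientation only (it leaves out Two-Way ANOVA, ANCOVA, and MANOVA), and the function name suggest_comparison is made up.

```python
def suggest_comparison(n_groups, related, normal):
    """Suggest a comparison test from the list above (simplified guide).

    n_groups: number of groups or repeated measurements being compared
    related:  True if the measurements come from the same subjects
    normal:   True if normality can be assumed
    """
    if n_groups < 2:
        raise ValueError("a comparison needs at least two groups")
    if n_groups == 2:
        if related:
            return "Paired T-Test" if normal else "Wilcoxon Signed-Rank"
        return "T-Test" if normal else "Mann-Whitney U"
    if related:
        return "Repeated Measures ANOVA" if normal else "Friedman Test"
    return "ANOVA" if normal else "Kruskal-Wallis"

print(suggest_comparison(2, related=False, normal=True))
print(suggest_comparison(3, related=False, normal=False))
```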

Correlation Tests

Use these tests to measure the strength and direction of the relationship between two continuous variables.

  • Pearson Correlation: measures the linear relationship between two normally distributed continuous variables.
  • Spearman Correlation: a rank-based measure of monotonic association, suitable when normality cannot be assumed or data is ordinal.
  • Partial Pearson Correlation: measures the linear relationship between two variables while controlling for the effect of one or more covariates.
  • Partial Spearman Correlation: the rank-based equivalent, also controlling for covariates.
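
The difference between the Pearson and Spearman coefficients is easiest to see on a small example. The Python sketch below computes both with the standard textbook formulas (Spearman as the Pearson correlation of average ranks). It is meant as an illustration, not as the app's code, and the function names and sample data are made up.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Made-up data: y grows with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(round(pearson(x, y), 3))   # 0.981
print(round(spearman(x, y), 3))  # 1.0
```

Here the relationship is perfectly monotonic but curved, so Spearman reports 1.0 while Pearson falls slightly short of it.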

Categorical Analysis

Use this test to examine associations between two categorical or binary variables.

  • Chi-Squared Test: tests whether there is a statistically significant association between two categorical variables. Produces a full contingency table with observed and expected counts.
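
To show where the expected counts come from, here is a minimal Python sketch of the Pearson chi-squared statistic for a contingency table. It uses the textbook formula without a continuity correction; whether StatInsight applies such a correction is not stated here, and the function name and the counts are made up.

```python
def chi_squared(observed):
    """Return (statistic, degrees of freedom, expected counts).

    observed is a list of rows, each a list of counts.
    No continuity correction is applied (assumption).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum((o - e) ** 2 / e
                    for obs_row, exp_row in zip(observed, expected)
                    for o, e in zip(obs_row, exp_row))
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected

# Made-up 2 x 2 table: rows are groups, columns are outcomes.
statistic, dof, expected = chi_squared([[10, 20], [20, 10]])
print(round(statistic, 3), dof)  # 6.667 1
print(expected)                  # [[15.0, 15.0], [15.0, 15.0]]
```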

Survival Analysis

Use survival analysis when your outcome is the time until an event occurs (e.g. death, relapse, failure), and some subjects may not have experienced the event yet (censored data).

  • Kaplan-Meier Analysis: estimates the probability of survival over time for one or more groups. Includes survival curves, event and survival tables, and log-rank tests for group comparisons.
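
The idea behind the Kaplan-Meier estimate, including how censored subjects are handled, can be sketched in a few lines of Python. This is the standard product-limit formula for a single group, shown for illustration only; the function name and the follow-up data are made up, and confidence intervals and log-rank tests are left out.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate for one group.

    times:  follow-up time for each subject
    events: True if the event was observed, False if the subject was censored
    Returns a list of (time, survival probability) at each event time.
    """
    survival = 1.0
    at_risk = len(times)
    curve = []
    for t in sorted(set(times)):
        observed = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without lowering the curve.
        at_risk -= leaving
    return curve

# Made-up follow-up data: the subject at time 2 is censored.
print(kaplan_meier([1, 2, 3, 4], [True, False, True, True]))
# [(1, 0.75), (3, 0.375), (4, 0.0)]
```

Note how the censored subject does not cause a drop at time 2 but still shrinks the number at risk, so the next event has a larger effect.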

Regression Models

Use regression models when you want to predict or explain an outcome based on one or more predictor variables.

  • Simple Linear Regression: models the relationship between one predictor variable and a continuous outcome.
  • Multiple Linear Regression: extends to multiple predictor variables for a continuous outcome. Returns coefficients, standard errors, and model fit statistics.
  • Multiple Logistic Regression: predicts the probability of a binary outcome based on multiple predictors. Returns odds ratios and confidence intervals.
  • Multiple Multinomial Regression: predicts a categorical outcome with three or more categories from multiple predictors.
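
For the simplest of these models, the fitted line and its fit statistic can be computed by hand. The Python sketch below shows ordinary least squares for one predictor, together with R². It illustrates the textbook method only; the function name fit_line and the data are made up, and standard errors are omitted.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared: share of the outcome's variation explained by the line.
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot

# Made-up data lying exactly on the line y = 1 + 2x.
print(fit_line([1, 2, 3, 4], [3, 5, 7, 9]))  # (2.0, 1.0, 1.0)
```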

AutoPrediction

AutoPrediction (also called Find Predictor) is a unique feature in StatInsight that automatically identifies which variables in your dataset are the most likely predictors of a chosen outcome.

Rather than running individual tests one by one, AutoPrediction evaluates all available variables simultaneously and ranks them by their influence on the target outcome. This is especially useful when you have many variables and are not yet sure where to focus your analysis.

How it works

StatInsight uses an ensemble of three complementary methods to score each variable. The combination reduces bias from any single approach and produces a more reliable ranking:

  • L1-Regularized Regression (Lasso): identifies variables with a linear relationship to the outcome. Variables with zero coefficients are effectively excluded.
  • Mutual Information: measures general statistical dependency between each variable and the outcome, including non-linear associations.
  • Random Forest Feature Importance: captures complex, non-linear interactions that simpler methods may miss.

The results are presented as a ranked list of predictor variables, ordered from most to least influential, along with a visualization of their relative importance scores.
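
One simple way to merge scores from several methods into a single ranking is to rescale each method's scores to the range 0 to 1 and average them. The Python sketch below shows that idea. How StatInsight actually combines the three methods is not described here, so the min-max scaling, the equal weighting, the function name combine_scores, and all the numbers are assumptions for illustration.

```python
def combine_scores(method_scores):
    """Average min-max scaled scores across methods and rank the variables.

    method_scores maps a method name to {variable: non-negative score},
    e.g. absolute Lasso coefficients or feature importances.
    Scaling and equal weights are assumptions, not the app's documented rule.
    """
    combined = {}
    n_methods = len(method_scores)
    for scores in method_scores.values():
        low, high = min(scores.values()), max(scores.values())
        span = high - low
        for variable, score in scores.items():
            scaled = (score - low) / span if span else 0.0
            combined[variable] = combined.get(variable, 0.0) + scaled / n_methods
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)

# Made-up scores for three candidate predictors.
scores = {
    "lasso": {"age": 0.8, "bmi": 0.0, "dose": 0.4},
    "mutual_information": {"age": 0.3, "bmi": 0.1, "dose": 0.2},
    "random_forest": {"age": 0.5, "bmi": 0.2, "dose": 0.3},
}
for variable, score in combine_scores(scores):
    print(variable, round(score, 2))
```

Rescaling first keeps one method from dominating simply because its scores happen to be on a larger scale.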

When to use AutoPrediction

  • You have a specific outcome variable in mind and want to know which other variables are associated with it.
  • You are in an early exploratory phase and need to narrow down a large variable set before running formal tests.
  • You want a data-driven starting point for model building or hypothesis generation.

Note: AutoPrediction is designed for exploration and variable screening; it does not replace confirmatory statistical tests. Use the results to guide your Custom Statistics analyses.

Filters & Navigation

StatInsight also offers a powerful and user-friendly filtering and navigation system to help you quickly find the data you care about.

After uploading your dataset, the Descriptives section allows you to filter variables by type (e.g., continuous, categorical, binary) or search by name. You can even enter partial names to instantly narrow down the list. This makes it easy to locate and explore specific variables in large datasets.

Additionally, in the Statistics panes, there's a second layer of filtering where you can search for specific statistical tests or results. A particularly helpful feature is the ability to filter by significant p-values, so you can focus on the tests that show meaningful differences or correlations.

By default, StatInsight highlights results with a p-value < 0.05 as statistically significant.

Next to the navigation and filter buttons, you'll also see a summary of the active filter, showing the current search criteria. By default, no filters are applied, so all variables and statistics are displayed.
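
Conceptually, the two filters described in this section combine a partial text match with a p-value threshold. The Python sketch below illustrates that combination. The Result class, the function name filter_results, the case-insensitive matching, and the example results are all made up for this illustration and do not reflect the app's internals.

```python
from dataclasses import dataclass

@dataclass
class Result:
    variables: tuple  # names of the variables analyzed
    test: str         # name of the statistical test
    p_value: float

def filter_results(results, query="", significant_only=False, alpha=0.05):
    """Keep results matching a partial name and, optionally, p < alpha."""
    needle = query.lower()
    selected = []
    for result in results:
        searchable = " ".join(result.variables + (result.test,)).lower()
        if needle and needle not in searchable:
            continue
        if significant_only and not result.p_value < alpha:
            continue
        selected.append(result)
    return selected

# Made-up results list.
results = [
    Result(("age", "weight"), "Pearson Correlation", 0.003),
    Result(("sex", "group"), "Chi-Squared Test", 0.41),
    Result(("weight", "group"), "ANOVA", 0.049),
]
# With no query and no significance filter, everything is shown (the default).
print(len(filter_results(results)))                                     # 3
print([r.test for r in filter_results(results, significant_only=True)])
```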


Plots

Every analysis in StatInsight is accompanied by a visualization tailored to the data and test type. Plots are generated automatically alongside results, with no extra steps required.

Descriptive Plots

Generated automatically in the Descriptives section for each variable:

  • Histogram: for continuous variables, with automatic binning
  • Bar plot: for categorical and binary variables, showing category counts
  • Pie chart: for categorical variables, showing proportions
  • Box plot: for continuous variables, showing spread and outliers
  • Time-area plot: for date variables, showing the distribution over time

Statistical Test Plots

Generated alongside each statistical result in Quick Statistics and Custom Statistics:

  • Scatter plot with trendline: for correlation tests (Pearson, Spearman)
  • Mean bar chart: for T-Test, ANOVA, Two-Way ANOVA
  • Mean line plot: for Paired T-Test and Repeated Measures ANOVA
  • Box plot: for Mann-Whitney U, Wilcoxon, Kruskal-Wallis
  • Violin plot: for distribution comparisons across groups
  • Stacked bar plot: for Chi-Squared / categorical association tests
  • Group scatter plot: for ANCOVA, showing groups relative to the covariate
  • Kaplan-Meier survival curves: for survival analysis, with at-risk tables
  • Regression scatter plot: for linear and logistic regression models
  • Feature importance bar chart: for AutoPrediction results

Plot Customization

Each plot can be customized directly from within the app using the built-in Plot Editor. You can modify:

  • Plot title and axis labels
  • Colors, patterns, and line styles
  • Plot dimensions and aspect ratio

Plots are exported automatically when you export your results to Word: each exported result includes its corresponding visualization.

Exports

StatInsight can export your results as a formatted Word document (.docx), ready for use in reports, publications, or presentations. Both descriptive summaries and statistical test results can be exported.

Descriptive Export

Exports the Descriptives section to a Word document. For each variable, the document includes:

  • A section heading with the variable name and type
  • A formatted statistics table (metric and value pairs)
  • The corresponding plot or histogram image

Statistics Export

Exports selected or all results from Quick Statistics or Custom Statistics. For each test, the document includes:

  • A heading identifying the variables analyzed and the test used
  • A brief explanation of what the test measures
  • Full numerical results (test statistic, p-value, effect size, power)
  • Pairwise comparison tables where applicable (e.g. ANOVA post-hoc, Kruskal-Wallis)
  • Regression summary tables (coefficients, odds ratios, confidence intervals)
  • Kaplan-Meier survival and event tables for survival analyses
  • The test visualization
  • A plain-language summary of the finding

Export Options

  • Export all: includes every result currently loaded
  • Export filtered: exports only the results currently visible after applying a filter (e.g. significant results only)

Exports run in the background with a progress indicator so you can continue working while the document is being generated. Large exports with many results are split across multiple files automatically.

Tip: Use the Filters & Navigation feature to narrow results to significant findings before exporting. This keeps your documents focused and easy to review.