Showcase examples

The Showcase contains the following examples, grouped by analytic.

Predict Numeric Fields

Algorithm: Linear Regression

Example	Dataset	Description
Predict Server Power Consumption	Server power (server_power.csv)	Predicts the power usage of a machine based on other metrics such as CPU utilization and memory transactions.
Predict VPN Usage	App statistics (apps.csv)	Predicts the VPN usage of employees based on the frequency of use of other apps.
Predict Median House Value	Housing (housing.csv)	Predicts the median home value in a region based on predictor fields related to housing value.
Predict Power Plant Energy Output	Power plant humidity (power_plant.csv)	Predicts the power output of the power plant given other measured variables, such as ambient temperature and humidity.
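
As a rough illustration of what these examples do, the following sketch fits a one-variable least-squares line in plain Python, outside the toolkit. The function names (`fit_line`, `predict`) and the CPU and power values are hypothetical and are not taken from server_power.csv.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new value."""
    slope, intercept = model
    return slope * x + intercept


# Made-up data: power in watts is exactly 100 + 2 * CPU utilization.
cpu_utilization = [10.0, 20.0, 30.0, 40.0, 50.0]
power_watts = [120.0, 140.0, 160.0, 180.0, 200.0]

model = fit_line(cpu_utilization, power_watts)
predicted_power = predict(model, 60.0)
```

The Showcase examples use several predictor fields at once; a single predictor is used here only to keep the arithmetic easy to follow.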

Predict Categorical Fields

Algorithm: Logistic Regression

Example	Dataset	Description
Predict Hard Drive Failure	Disk failures (disk_failures.csv)	Predicts whether the hard drive is going to fail based on various indicators of drive reliability.
Predict the Presence of Malware	Firewall traffic (firewall_traffic.csv)	Predicts whether a firewall is affected by malware or has a vulnerability, based on various traffic indicators on the firewall.
Predict Telecom Customer Churn	Churn (churn.csv)	Predicts whether a customer will change providers (denoted as churn) based on customer usage patterns.
Predict the Presence of Diabetes	Diabetes (diabetes.csv)	Predicts the response (presence of diabetes) in the diabetes data.
Predict Vehicle Make and Model	Track day (track_day.csv)	Predicts the vehicle type given other onboard metrics.
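
The yes-or-no predictions in this group can be sketched in plain Python as a one-feature logistic regression trained by gradient descent. This is an illustration under stated assumptions, not the toolkit's implementation: the names (`fit_logistic`, `predict_failure`), the `reallocated_sectors` feature, and the data values are all hypothetical.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, labels, learning_rate=0.1, epochs=2000):
    """Fit weight and bias by batch gradient descent on the log loss."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, labels):
            error = sigmoid(weight * x + bias) - y
            grad_w += error * x
            grad_b += error
        weight -= learning_rate * grad_w / n
        bias -= learning_rate * grad_b / n
    return weight, bias


def predict_failure(model, x):
    """Return True when the predicted failure probability is at least 0.5."""
    weight, bias = model
    return sigmoid(weight * x + bias) >= 0.5


# Made-up data: drives with many reallocated sectors are labeled as failed.
reallocated_sectors = [0.0, 1.0, 2.0, 3.0, 7.0, 8.0, 9.0, 10.0]
failed = [0, 0, 0, 0, 1, 1, 1, 1]

model = fit_logistic(reallocated_sectors, failed)
```

With this data, `predict_failure(model, 1.0)` is false and `predict_failure(model, 9.0)` is true. Predicting one of several classes, as in the vehicle example, needs a multiclass extension that this sketch does not cover.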

Detect Numeric Outliers

Algorithm: Distribution Statistics

Example	Dataset	Description
Detect Outliers in Server Response Time	Server response time (hostperf.csv)	Detects outliers in server response time.
Detect Outliers in Number of Logins (vs. Predicted Value)	Employee logins (logins.csv)	Forecasts the number of logins by hour and identifies when the actual number of logins differs significantly from the forecast.
Detect Outliers in Supermarket Purchases	Supermarket purchases (supermarket.csv)	Detects outliers in the quantity of purchases at a supermarket.
Detect Outliers in Power Plant Humidity	Power plant humidity (power_plant.csv)	Detects outliers in the humidity of a power plant.
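
One common distribution-statistics approach is to flag values that lie more than a chosen number of standard deviations from the mean. The sketch below shows that idea in plain Python; the function name `find_outliers`, the multiplier of 2, and the response times are assumptions made for illustration, not values from hostperf.csv.

```python
import statistics


def find_outliers(values, multiplier=2.0):
    """Return values outside mean +/- multiplier * standard deviation."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    lower = mean - multiplier * stdev
    upper = mean + multiplier * stdev
    return [v for v in values if v < lower or v > upper]


# Made-up response times in milliseconds, with one obvious spike.
response_times_ms = [100, 102, 98, 101, 99, 100, 103, 97, 100, 500]

outliers = find_outliers(response_times_ms)
```

Here only the 500 ms value falls outside the band. Because a large spike also inflates the mean and standard deviation, thresholds based on the median or on quartiles are often preferred when the data contains extreme values.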

Detect Categorical Outliers

Algorithm: Probabilistic Measures

Example	Dataset	Description
Detect Outliers in Disk Failures	Disk failures (disk_failures.csv)	Detects categorical outliers in disk failure data.
Detect Outliers in Bitcoin Transactions	Bitcoin transactions (bitcoin_transactions.csv)	Detects outliers in Bitcoin transactions that may reflect unusual activity.
Detect Outliers in Supermarket Purchases	Supermarket purchases (supermarket.csv)	Detects outliers across whole transactions at a supermarket.
Detect Outliers in Mortgage Contracts	Mortgage loans for New York (mortgage_loan_ny.csv)	Detects outliers in mortgage loans in New York.
Detect Outliers in Diabetes Patient Records	Diabetic data (diabetic.csv)	Detects outliers in diabetic data.
Detect Outliers in Mobile Phone Activity	Phone usage (phone_usage.csv)	Detects outliers in the number of calls that are incoming, outgoing, or missed from various phones.
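
A very simple probabilistic measure for categorical data is the empirical frequency of each value: values that occur with low probability are treated as outliers. The sketch below illustrates that idea only. The function name `rare_values`, the 10% threshold, and the drive model labels are hypothetical, and the measure used by the Showcase examples may differ.

```python
from collections import Counter


def rare_values(values, max_probability=0.1):
    """Return each value whose relative frequency is below max_probability."""
    counts = Counter(values)
    total = len(values)
    return {
        value: count / total
        for value, count in counts.items()
        if count / total < max_probability
    }


# Made-up categorical field: 12 drives of model A, 7 of model B, 1 of model C.
drive_models = ["A"] * 12 + ["B"] * 7 + ["C"]

rare = rare_values(drive_models)
```

With this data, only model C (1 of 20 records) falls below the threshold.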

Forecast Time Series

Algorithm: State-space Method using Kalman Filter

Example	Dataset	Description
Forecast Internet Traffic	Internet traffic (internet_traffic.csv)	Forecasts the peak and off-peak times of internet usage given a few full cycles of internet traffic history.
Forecast the Number of Employee Logins	Employee logins (logins.csv)	Forecasts the number of logins by hour.
Forecast Monthly Sales	Souvenir sales (souvenir_sales.csv)	Forecasts the number of souvenir sales by month for a souvenir shop.
Forecast the Number of Bluetooth Devices	Bluetooth devices (bluetooth.csv)	Forecasts the number of distinct Bluetooth contacts that are made to the access points placed in the busiest lecture halls on the campus of the National University of Singapore.
Forecast Exchange Rate TWI using ARIMA	Exchange Rate TWI (exchange.csv)	Forecasts the trade-weighted index (TWI) of a currency.
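
To give a feel for the state-space approach, the sketch below runs a one-dimensional Kalman filter for a local level model (a slowly drifting level observed with noise) and uses the filtered level as the one-step-ahead forecast. It is a minimal sketch under stated assumptions: the function name `forecast_next`, the variance parameters, and the traffic values are hypothetical, and the model has no trend or seasonal component, so it cannot reproduce the peak and off-peak cycles described above.

```python
def forecast_next(observations, process_var=1e-3, obs_var=1.0):
    """One-step-ahead forecast from a local level Kalman filter."""
    estimate = observations[0]  # initial guess for the hidden level
    error_var = 1.0             # initial uncertainty about that guess
    for observed in observations[1:]:
        # Predict step: the level may drift, so uncertainty grows.
        error_var += process_var
        # Update step: blend the prediction with the new observation.
        gain = error_var / (error_var + obs_var)
        estimate += gain * (observed - estimate)
        error_var *= 1.0 - gain
    # For a local level model the forecast is the last filtered level.
    return estimate


# Made-up traffic measurements fluctuating around 10.
traffic = [10.0, 10.4, 9.7, 10.1, 9.9, 10.2]

forecast = forecast_next(traffic)
```

A larger `process_var` makes the filter follow recent observations more closely, while a larger `obs_var` makes it smooth more heavily.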

Cluster Numeric Events

Algorithms: K-means, DBSCAN, Spectral Clustering, BIRCH

Example	Dataset	Description
Cluster Hard Drives by SMART Metrics	disk_failures.csv	Clusters hard drives based on the self-monitoring metrics they generate.
Cluster Behavior by App Usage	app_usage.csv	Clusters the behavior of employees based on how frequently they use business applications like Webmail or VPN.
Cluster Neighborhoods by Properties	housing.csv	Clusters neighborhoods based on properties like crime rate and median house value.
Cluster Vehicles by Onboard Metrics	track_day.csv	Clusters vehicles driven on a racetrack by onboard metrics like engine temperature and G-forces.
Cluster Power Plant Operating Regimes	power_plant.csv	Clusters the operating regimes of a power plant based on ambient measurements like temperature and vacuum.
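
Of the algorithms listed for this group, K-means is the simplest to sketch: assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat. The plain Python version below is illustrative only. The function name `kmeans`, the fixed starting centroids, and the two-field data points are assumptions and are not drawn from any Showcase dataset.

```python
import math


def nearest_index(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, centroids, iterations=10):
    """Run a fixed number of assign-and-update rounds of K-means."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            clusters[nearest_index(point, centroids)].append(point)
        # Move each centroid to the mean of its cluster; keep it if empty.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    labels = [nearest_index(point, centroids) for point in points]
    return centroids, labels


# Made-up measurements with two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]

centroids, labels = kmeans(points, centroids=[points[0], points[3]])
```

With this data the first three points share one label and the last three share the other. In practice the starting centroids are usually chosen at random and the fields are scaled first, since K-means is sensitive to both.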