Uncovering Patterns and Anomalies in Manufacturing Data

INFO 523 - Final Project

Project description
Author: Cesar Castro

Affiliation: College of Information Science, University of Arizona

Project Description:

  • Uncovering Patterns and Anomalies in Manufacturing Data

🎯Goals:

Modern factories generate vast amounts of data. Manufacturing equipment continuously monitors parameters such as temperature, vibration, motor speed, and energy consumption through sensors and other instrumentation. Variations in these parameters can indicate shifts in performance that may lead to defects or catastrophic equipment failures. Detecting these shifts early has become increasingly important for reducing downtime and boosting productivity.

Advanced techniques such as machine learning, anomaly detection, and image analysis are now used to forecast when equipment might require maintenance, calibration, or material changes. This project uses public synthetic data from Kaggle to compare several classification and regression models for predicting these critical events. If time allows, we will also explore anomaly detection techniques on time series data to flag potential failures as early as possible.

📊Proposed Datasets:

  1. Source: Kaggle - Machine Predictive Maintenance Classification (a synthetic dataset that reflects real predictive maintenance data encountered in industry)

Data Example:

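|   | UDI | Product ID | Type | Air temperature [K] | Process temperature [K] | Rotational speed [rpm] | Torque [Nm] | Tool wear [min] | Target | Failure Type |
|---|-----|------------|------|---------------------|-------------------------|------------------------|-------------|-----------------|--------|--------------|
| 0 | 1 | M14860 | M | 298.1 | 308.6 | 1551 | 42.8 | 0 | 0 | No Failure |
| 1 | 2 | L47181 | L | 298.2 | 308.7 | 1408 | 46.3 | 3 | 0 | No Failure |
| 2 | 3 | L47182 | L | 298.1 | 308.5 | 1498 | 49.4 | 5 | 0 | No Failure |
| 3 | 4 | L47183 | L | 298.2 | 308.6 | 1433 | 39.5 | 7 | 0 | No Failure |
| 4 | 5 | L47184 | L | 298.2 | 308.7 | 1408 | 40.0 | 9 | 0 | No Failure |
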
  2. Source: Kaggle - Intelligent Manufacturing Dataset (the Intelligent Manufacturing Dataset for Predictive Optimization, designed for research in smart manufacturing, AI-driven process optimization, and predictive maintenance)

Data Example:

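|   | Timestamp | Machine_ID | Operation_Mode | Temperature_C | Vibration_Hz | Power_Consumption_kW | Network_Latency_ms | Packet_Loss_% | Quality_Control_Defect_Rate_% | Production_Speed_units_per_hr | Predictive_Maintenance_Score | Error_Rate_% | Efficiency_Status |
|---|-----------|------------|----------------|---------------|--------------|----------------------|--------------------|---------------|-------------------------------|-------------------------------|------------------------------|--------------|-------------------|
| 0 | 2024-01-01 00:00:00 | 39 | Idle | 74.137590 | 3.500595 | 8.612162 | 10.650542 | 0.207764 | 7.751261 | 477.657391 | 0.344650 | 14.965470 | Low |
| 1 | 2024-01-01 00:01:00 | 29 | Active | 84.264558 | 3.355928 | 2.268559 | 29.111810 | 2.228464 | 4.989172 | 398.174747 | 0.769848 | 7.678270 | Low |
| 2 | 2024-01-01 00:02:00 | 15 | Active | 44.280102 | 2.079766 | 6.144105 | 18.357292 | 1.639416 | 0.456816 | 108.074959 | 0.987086 | 8.198391 | Low |
| 3 | 2024-01-01 00:03:00 | 43 | Active | 40.568502 | 0.298238 | 4.067825 | 29.153629 | 1.161021 | 4.582974 | 329.579410 | 0.983390 | 2.740847 | Medium |
| 4 | 2024-01-01 00:04:00 | 8 | Idle | 75.063817 | 0.345810 | 6.225737 | 34.029191 | 4.796520 | 2.287716 | 159.113525 | 0.573117 | 12.100686 | Low |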
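
Because the second dataset is timestamped per machine, it may help to parse and order it before any time series work. The following is a minimal sketch with pandas, assuming the column names shown in the data example; it reads the first three sample rows from an in-memory string (restricted to a subset of columns for brevity) rather than from the actual CSV file, whose name is not given here.

```python
import io

import pandas as pd

# First three sample rows from the data example, restricted to a subset of
# columns to keep the sketch short. In the project, the text would come from
# the Kaggle CSV file instead of this in-memory string.
csv_text = """Timestamp,Machine_ID,Operation_Mode,Temperature_C,Vibration_Hz,Efficiency_Status
2024-01-01 00:00:00,39,Idle,74.137590,3.500595,Low
2024-01-01 00:01:00,29,Active,84.264558,3.355928,Low
2024-01-01 00:02:00,15,Active,44.280102,2.079766,Low
"""

# Parse the timestamp column as datetimes while reading.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Timestamp"])

# Treat the operating mode as a categorical feature.
df["Operation_Mode"] = df["Operation_Mode"].astype("category")

# Order readings per machine and in time, so each machine forms one series.
df = df.sort_values(["Machine_ID", "Timestamp"]).reset_index(drop=True)

print(df.dtypes)
```

Sorting by machine and then by time is one possible layout; grouping with `df.groupby("Machine_ID")` afterwards would give one series per machine.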

πŸ—“οΈProject Schedule

Week 1:

  • Definition of problem statement and goals

  • Plan to incorporate peer review feedback into project plan

  • Data cleaning (handle missing values and outliers; define imputation methods).

  • Define the key response variable for each dataset, depending on the model: for example, defect pass/fail for a classification model or defect rate for a regression model.
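
The cleaning and response-definition steps above could look roughly like the sketch below. It uses the five sample rows from the first dataset, with one torque value blanked out on purpose (an assumption made only to exercise the imputation). Median imputation and the 1.5 × IQR outlier rule are illustrative choices, not methods the project has committed to, and the helper names `impute_median` and `iqr_outlier_mask` are hypothetical.

```python
import numpy as np
import pandas as pd

# Sample rows from the Machine Predictive Maintenance data example.
# One torque value is set to NaN deliberately (illustration only).
df = pd.DataFrame({
    "Air temperature [K]": [298.1, 298.2, 298.1, 298.2, 298.2],
    "Process temperature [K]": [308.6, 308.7, 308.5, 308.6, 308.7],
    "Rotational speed [rpm]": [1551, 1408, 1498, 1433, 1408],
    "Torque [Nm]": [42.8, 46.3, np.nan, 39.5, 40.0],
    "Tool wear [min]": [0, 3, 5, 7, 9],
    "Target": [0, 0, 0, 0, 0],
})


def impute_median(frame, columns):
    """Return a copy with missing values in `columns` replaced by the column median."""
    out = frame.copy()
    for col in columns:
        out[col] = out[col].fillna(out[col].median())
    return out


def iqr_outlier_mask(series, k=1.5):
    """Flag values more than k * IQR below the first or above the third quartile."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


feature_cols = [c for c in df.columns if c != "Target"]

print(df[feature_cols].isna().sum())          # missing values per feature
clean = impute_median(df, feature_cols)       # fill gaps with the median
outliers = clean[feature_cols].apply(iqr_outlier_mask)  # boolean mask per cell

# Response for a classification model: the binary Target column.
X, y = clean[feature_cols], clean["Target"]
```

For a regression model on the second dataset, the response would instead be a continuous column such as `Quality_Control_Defect_Rate_%`.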

Week 2:

  • Analyze features (use PCA or other methods to understand which features contribute most to the variability, etc.)

  • Classification Model Creation and Validation

  • Regression Model Creation and Validation

  • Compare models and recommend the best one.
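
As a rough illustration of the Week 2 steps, the sketch below runs PCA to inspect explained variance and then compares two classifiers with cross-validation using scikit-learn. The data is a synthetic stand-in generated with `make_classification` (an assumption: five numeric features and roughly 10% failures), and the two models and the `balanced_accuracy` metric are example choices rather than the project's final selection.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cleaned sensor features (assumption, not real data).
X, y = make_classification(
    n_samples=500, n_features=5, n_informative=3, n_redundant=1,
    weights=[0.9, 0.1], random_state=0,
)

# Feature analysis: standardize, then check how much variance each
# principal component explains.
pca_pipeline = make_pipeline(StandardScaler(), PCA()).fit(X)
explained = pca_pipeline.named_steps["pca"].explained_variance_ratio_
print("Explained variance ratio:", explained.round(3))

# Candidate classifiers to compare (illustrative choices).
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# 5-fold cross-validation; balanced accuracy accounts for class imbalance.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="balanced_accuracy").mean()
    for name, model in models.items()
}

best = max(scores, key=scores.get)
print(scores, "-> best:", best)
```

The same pattern would carry over to the regression comparison by swapping in regressors and a regression metric such as `scoring="r2"`.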

Week 3:

  • Assess anomaly detection and recommend possible next steps.

  • Prepare final report and presentation
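
One simple baseline for the anomaly detection assessment is a rolling z-score, sketched below with only the Python standard library. The function name `rolling_zscore_anomalies`, the window size, the threshold, and the vibration readings are all hypothetical and chosen for illustration; the project may end up using a different technique.

```python
from collections import deque
from statistics import mean, stdev


def rolling_zscore_anomalies(values, window=5, threshold=3.0):
    """Return indices whose value lies more than `threshold` standard
    deviations from the mean of the preceding `window` readings."""
    history = deque(maxlen=window)
    flagged = []
    for i, x in enumerate(values):
        # Only score a reading once a full window of history is available.
        if len(history) == window:
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(x - mu) / sigma > threshold:
                flagged.append(i)
        history.append(x)
    return flagged


# Hypothetical vibration readings (Hz) with one injected spike at index 8.
vibration = [2.0, 2.1, 1.9, 2.0, 2.2, 2.1, 2.0, 1.9, 6.5, 2.0, 2.1]
anomalies = rolling_zscore_anomalies(vibration)
print(anomalies)  # [8]
```

Note that a flagged spike still enters the history window, which inflates the standard deviation for the next few readings; excluding flagged points from the window is one possible refinement.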

📁Project Organization

| FINAL-PROJECT-CASTRO
| --- 📁DATA: # Raw data files obtained from Kaggle in CSV format
| ______|---- 📁processed: # Cleaned and processed datasets
| ______|---- 📁results: # Model evaluation and other results
| --- 📁IMAGES: # Images used by the Quarto site
| --- 📁presentation_files: # Quarto presentation files
| --- 📁extra: # Additional documents or files used in the project
| --- 📁quarto: # Quarto files
| --- 📁src: # Source code used for the project
| --- 📁.github: # GitHub configuration files
| -- 📄requirements.txt: # Python dependencies
| -- 📄_quarto.yml: # Quarto metadata and configuration
| -- 📄.gitignore: # List of files and directories to be ignored by Git
| -- 📄about.qmd: # Quarto about page with general information about the project
| -- 📄presentation.qmd: # Quarto final project presentation
| -- 📄proposal.qmd: # Project problem statement and proposal
| -- 📄README.md: # Main README file for the Git repository