This study presents a novel algorithm for surface defect detection in processor sockets based on machine learning (ML). The proposed model can successfully detect various socket defects to help prevent the cost of microprocessor defects on the production floor.
INTRODUCTION
The processor socket plays an important role in computing systems, providing mechanical connection, power delivery and data communication between the processor (CPU) and the printed circuit board (PCB). In the semiconductor industry, sockets are widely used during electrical and functional testing of processors. Without them, the PCB cannot contact the CPU properly for power and electrical signal transmission. Because of this important role in any processor facility, sockets need regular maintenance and repair to preserve their electrical characteristics. Given their fundamental place in integrated validation systems, demand for high-quality sockets has persisted for years in the semiconductor industry. The electrical conductivity and data integrity of a socket are essential to product yield and overall manufacturing productivity.
Figure 1.1 CPU socket in personal computer (PC) [1]
However, the biggest challenge faced during production in any factory facility is surface defects induced by various factors such as environment, human error or process. There is a variety of socket defects: contamination with foreign material; dented or burnt pins; damaged pads; scratched or stained surfaces; imprinting of the socket by hard debris; bent pins; bridging of contact pins by metal particles; etc. These defects are illustrated in figure 1.3. Socket damage can result in poor electrical connectivity, reduced validation capacity and invalid testing failures, leading to a significant drop in yield. Moreover, sensitive defects such as conductive metal particles can cause short circuits, inflicting widespread and severe damage on processors. Early detection and removal of defect factors therefore becomes critical during maintenance, to ensure socket quality before release to the production line.
Figure 1.3 Different types of socket defects: contaminated socket with foreign material (a), burnt socket (b), hard debris on socket pin (c), stained socket surface (d), bent pin (e), bridging pins with metal particle (f) [3]
Traditionally, manual inspection using a microscope has been the common approach to detect defects; it is time consuming, gives low efficiency and carries high quality risk. To be qualified for the socket inspection task, a technician needs to be trained and certified. However, training material must be renewed, and technicians are required to retake the training class whenever a new type of defect is observed for the first time. This approach is too passive and not robust against unseen defects, which leads to future escapes of defective sockets. Furthermore, it is very challenging to detect all surface defects by manual inspection with a microscope in a manufacturing line. In practice, sockets are placed under the microscope and scanned by technicians, so detectability and efficiency depend heavily on each person's assessment and skill. As a result, not all defective sockets are captured at the repair station, causing poor performance or even electrical failure or damage to processors during production. According to a statistical summary of historical events and experiments, manual human inspection achieved at most 80% accuracy and led to escaped sockets due to under-triggering. Moreover, manual inspection is not time-efficient and always varies with person-to-person ability (eyesight, awareness, agility). Especially for high-pin-count sockets, human eyes must sustain long periods of high focus, which leads to fatigue and loss of concentration at work. This is highly likely to impact worker health and degrade socket inspection efficiency. To overcome this problem and eliminate human mistakes, it is essential to automate socket defect detection.
Figure 1.4 Illustration of manual inspection with microscope [4]
There were existing advantages for an automation solution to be considered and deployed. Defective samples of all known defect types were available, which could be turned into a valuable image dataset for a computer vision approach. The in-use microscope model was capable of capturing high-definition images at multiple magnification levels thanks to both optical and digital zooming. In addition, the microscope could be connected to a computer to store image data after each socket sample was inspected and captured. The next prospect was the availability of computational resources that could be utilized for image processing and the heavy computation of model training and evaluation. Model training was the most time-consuming task, acknowledged to take up to several days or weeks to complete. Lastly, intelligent algorithms have become well known for classification tasks in various aspects of life and industry. Many related studies are introduced, and their advantages and disadvantages reviewed, in chapter 2.
To automate socket inspection, eliminate human dependence in the classification task and further enhance the accuracy of this essential task, this study introduces a novel approach for surface defect detection in processor sockets based on machine learning (ML). One of the advanced features we could benefit from in an ML algorithm is its capability to recognize unseen defects. These new defect samples can later be labeled as "bad" and saved for model self-learning, improving model capability and performance. Thanks to this, renewal of training material and re-training become unnecessary and are eliminated. The hardest obstacle in this field of study is the demand for a large quantity of data to train the network to recognize the different defects that might exist on sockets. Insufficient data collection results in inaccurate results or underfitting. Furthermore, various noise factors arise during visual inspection, such as lighting, vibration and camera focal length. In this regard, ML is a distinct method that offers high accuracy and an effective technique for detecting defects, and it is easy to understand and deploy. The proposed model can successfully detect various socket defects and help prevent the cost of microprocessor defects on the production floor.
Chapter 2 analyzes and evaluates existing research works related to the topic and the study objectives. In this section, we survey various research works, theories and methodologies and raise the existing problems that the project needs to focus on researching and solving. Chapter 3 then presents the theory and techniques used in this research: it describes the methods, techniques and data utilized to train the models and the ways to optimize and enhance model performance. A brief description of the work conducted in this research is given in chapter 4, followed by a discussion based on the results obtained and a comparison with the results of other authors through the referenced documents. Lastly, chapter 5 presents the conclusion and recommendations for further research.
OVERVIEW
Introduction
Starting in the 1980s, machine vision methods were first studied for surface defect detection (on steel). In the 1990s, researchers studied and applied vision-based automatic surface inspection systems (ASIS) in manufacturing environments. Like surface defects in CPU sockets, defects in steel products have an unfavorable impact on appearance, as shown in figure 2.2, and machine vision methods can be applied to detect them online instead of manual monitoring by the human eye. Over the last 10 years, researchers have published scientific papers on surface defect detection implemented with machine vision. A survey of statistical-based textural defect detection as an automated approach was introduced by Czimmermann et al. [5]. However, that study reviewed a wide range of objects made of various materials such as metals, ceramics and textiles, and was not specific to any steel product. Tang et al. reviewed surface defect inspection methods based on machine vision and introduced image acquisition systems and image processing algorithms [6]. The threshold method was introduced as having a simple implementation and low computational resource requirements while providing stable outcomes. However, the difference between socket surface defects and the background is not obvious, so this method was not considered for our application. This can be demonstrated by the experiment below, where thresholding was applied to our input images.
Figure 2.1 Output images of threshold method
Notably, environmental noise affected the efficiency and robustness of image segmentation by the threshold method. Consequently, a lot of noise remained in good samples, which made it challenging to differentiate between true defects and noise.
Edge detection methods were also presented: accurate but sensitive to environmental noise (Roberts), or good at preserving detail and resisting noise but requiring a huge amount of calculation (Kirsch). Another introduced method was clustering, which is robust and stable but not applicable for online detection due to its computational complexity (mean shift) or its requirement for the number of clusters to be given in advance (K-means).
Figure 2.2 Categories of visible defects in steel products: scratch
The review of visual inspection using traditional methods gave us some observations. There are numerous reported works based on edge detection, an essential part of machine vision widely used in image segmentation and object recognition [8]. One of the continuing research efforts is a new color edge detection method based on the fusion of the hue component, introduced by Tao et al. [9]. The authors proposed calculating the hue difference and applying it to classical gradient operators to accurately detect edges of hue components with low computational complexity. However, this method is heavily dependent on color and brightness and has no resistance to noise. Another insight into traditional methods is that they require prior knowledge of defect features (color, texture, shape, etc.). This results in a failure to detect unseen defects, as exemplified in figure 2.3.
Figure 2.3 Failure of defect pin detection due to similar white color feature
Recently, DL methods for target detection have been widely researched. Convolutional neural networks (CNNs) are usually used to extract features automatically and then classify them into specific groups for each application [6]. Unlike traditional pattern recognition methods, DL offers automatic learning of features from datasets, whereas manually designed features rely mainly on the knowledge and experience of the designer. Industrial applications have also demanded better outcomes, which has led to ML being implemented in image processing at the analytical stages [11].
In the electronics industry, it has become ineffective to apply traditional detection methods due to the wide variety of PCB components in different and complex scenes. The deep convolutional neural network is considered the best approach, with obvious advantages, in [16]. Jiaquan et al. described a lightweight deep convolution network, called LD-PCB, to detect PCB type in real time, and a quick character identification model, called CR-PCB. To achieve the desired PCB defect detection system, a PCB dataset was established and used to train the LD-PCB and CR-PCB models. This study gave better performance, overcoming the weaknesses of other methods with better accuracy and less model computation, which proves its capability to meet the needs of PCB defect detection in industrial production.
In the field of surface defect detection, features are usually extracted by thresholding, edges and regions with conventional image processing algorithms, or by manually specifying the desired features [17-20]. In practice, complex defects and objects and environmental noise affect the efficiency and robustness of traditional methods based on feature hypotheses. With the development of ML, several research projects have made noteworthy contributions toward making defect inspection in complex circumstances a reality. In [21], a method for crack detection on concrete bridge surfaces was presented as the deep bridge crack classification (DBCC)-Net, a classification-based DL network. The original regression problem of target detection was converted into a binary classification problem by pruning YOLOX. In addition, the authors proposed network post-processing and a two-stage detection strategy to detect cracks and extract crack morphology quickly. The results showed the system's ability to locate cracks and fully extract the crack shape features, which proves the high applicability and reference value of this research.
The results of study [22] provided multiple insights and methods for detecting and classifying pavement cracks from images. The performance of 10 pre-trained CNN architectures adopting various optimization techniques was explored and summarized in detail. The developed models were also subjected to robustness testing against images contaminated with noise and proved to give consistent performance. However, that study focused on only three types of cracking, leaving more types to be evaluated in further research.
Another study related to surface defect detection, on steel plate surfaces, was conducted by Bo et al. in [23]. In this work, unlike most recent research applying convolutional neural networks (CNNs), another approach was noted: a DL method based on the Transformer. In the detection system, hierarchical features were extracted by a Swin Transformer, which abandons the CNN network structure and relies on a self-attention mechanism. Evaluation on the NEU-DET dataset showed that the proposed algorithm achieved 81.1% mAP, best in class among known classic CNN detection methods such as Faster R-CNN, SSD, YOLOv3 and RepPoints.
Aero-engine blade crack detection plays a critical role in daily ground maintenance and remains a real challenge due to the random distribution and complex characteristics of cracks. Defect detection was enhanced with a YOLOv4-tiny model that was universal and concentrated on crack characteristics [24]. It was introduced with a combination of an attention mechanism and bicubic interpolation to enhance crack focusing and the effect of multi-scale feature fusion. Evaluated on images with varying lighting and noise, the proposed system's robustness was proved with an average accuracy of 81.6%, better than the original YOLOv4-tiny. Furthermore, the model met real-time requirements and could be applied in practice as aero-engine blade crack detection equipment.
In Vietnam, in the early 21st century, there was research on artificial intelligence, especially in the fields of image processing and identification. Many studies had practical implementations such as handwriting identification and human voice, gesture and face recognition [25-27]. These studies presented traditional ML algorithms such as the support vector machine (SVM), decision tree and artificial neural network (ANN), which formed an important starting point for further learning in this new field of technology. In recent years, significant advancements in computer hardware, especially the graphics processing unit (GPU), along with substantial investments in artificial intelligence (AI), have enabled convolutional neural networks (CNNs) to effectively extract specific features from images. This has provided wider application in object identification and classification [28-32].
In terms of surface defect detection, some ML solutions are proposed by suppliers as embedded systems, mainly imported from foreign companies. There is no local study of ML applied in this field to provide knowledge and algorithms for further research and development. Therefore, in this study, the author researched ML to identify surface defects and provide a solution with higher accuracy and effectiveness.
Study objectives
The review of a significant amount of reported work in the same field of study gave us some insights into the existing challenges to overcome and the possible development of this application area. This research targeted exploring the capability of ML to detect two main kinds of socket surface defects: contamination with foreign material and bridging of pins by metal particles. The objectives are to (1) collect data from socket inspection images, (2) build DL models using various combinations of techniques and (3) evaluate the performance of the designed model against other models.
Figure 2.4 Socket defects in this study: contaminated socket with foreign material (a) and bridging pins with metal particle (b)
MATERIALS AND METHODS
Image acquisition and processing
A visual inspection system must have a camera as one of its key components. The camera captures light and converts it into an electrical signal, which can later be processed by a computer. The digital signal represents the intensity of light at various points, forming a two-dimensional grid called a pixel array. There are two types of cameras: the CCD camera (charge-coupled device) and the CMOS camera (complementary metal-oxide-semiconductor). The advantages of CCD are a wide dynamic range, high sensitivity and resolution, and small distortion and volume. CMOS has the advantages of low cost and power consumption. The gap between CCD and CMOS has narrowed over time.
In a CCD sensor, each photodiode turns light into electricity and relays it; an amplifier located at the exit amplifies the electricity and turns it into a signal. Based on the arrangement of the photosensitive units, there are line-scanning and area-scanning CCD cameras. A high scanning frequency and continuous detection of moving objects can be achieved with a line-scanning CCD under uniform moving speed and illumination.
In contrast, an area-scanning CCD can capture 2D image information but faces the problem of combining a large field of view with high resolution. Images taken by a CCD camera are sharper, with lower noise, even in low-light conditions. However, CCD cameras tend to have slower readout speeds than CMOS cameras, which makes them less suitable for high-speed applications.
In a CMOS sensor, each photodiode is equipped with an amplifier, so the converted electricity is amplified on the spot and turned into a signal. By operating a matrix of switches, the pixel signals can be accessed directly and sequentially, and at a much higher speed than with a CCD sensor. The CMOS sensor therefore allows a faster readout speed, enabling the camera to capture fast-moving objects or video frames at higher rates. Having an amplifier for each pixel also reduces the noise that occurs when reading the electrical signals converted from captured light, although slightly more image noise is observed in comparison with a CCD camera under the same low-light conditions.
In this study, to meet the high-speed demand of the system, a CMOS camera is more suitable than a CCD camera. Its lower power consumption and lower heat generation are ideal for continuous operation in factories and production lines. In addition, illumination is made uniform by the lighting setup, which enables the CMOS camera to capture images with less noise.
The light source is one of the key factors affecting imaging quality. Common light sources include halogen, fluorescent, xenon lamps and LEDs:
- Halogen: high brightness, uniform brightness, and color temperature
- Fluorescence: uniform luminescence, good diffusion
- LED: long service life, lower power, fast response, uniform and stable light
Owing to these many advantages, an LED was implemented as the lighting source for this surface defect detection system.
The sockets used in this study were in-use sockets taken directly from the manufacturing line. A high-definition digital microscope, the EVOCAM II, was used for image capturing, with the CMOS-type camera specification given in table 3.1. For this experiment, the optical magnification was increased to 10x to see enough detail of the contact pins of every socket. After placing a socket sample, the microscope was turned on, viewing began and high-resolution images were captured at the touch of a built-in button or through software on the PC connected to the microscope. The output image was then saved as a .png file in a PC folder.
Table 3.1 EVOCAM II camera specification
Camera resolution Full-HD 1080p, 1920 x 1080, 1/2.8″ CMOS
Frame rate 50fps & 60fps (switchable)
Focus Auto focus and manual focus
Any trained DL model requires a dataset of sufficient quality and quantity, otherwise its efficacy is impacted. Consequently, it is essential to arrange images in a systematic way for later use in model training and testing. In this experiment, each image needed to be labeled as a good or bad sample. The total number of samples was 749, comprising 582 good samples and 167 bad samples. The unequal distribution originated from the natural population of sockets in the facility; it was addressed by data augmentation so that the bad class reached the same count as the 582 good samples. Data augmentation was performed only after the dataset split, to avoid data leakage among the training, validation and testing sets.
Using the full dataset for training (without a training/validation split) leads to underperformance or overfitting. To overcome this obstacle, we split the dataset into 80% training, 10% validation and 10% testing. The training set was used to build and optimize the model, while the validation set was used to validate model performance during training. The testing set was then used to assess the final performance of the model. By the end of this stage, the images within the dataset were assigned to six separate categories, as shown in table 3.2.
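A minimal splitting sketch is given below; the file layout and seed are illustrative assumptions, not the study's actual file names.

import random

def split_dataset(paths, train=0.8, val=0.1, seed=42):
    random.Random(seed).shuffle(paths)
    n = len(paths)
    n_train, n_val = int(n * train), int(n * val)
    return (paths[:n_train],                     # training set (80%)
            paths[n_train:n_train + n_val],      # validation set (10%)
            paths[n_train + n_val:])             # testing set (10%)

# Example with the 749 collected samples (file names are placeholders):
all_images = [f"dataset/sample_{i}.png" for i in range(749)]
train_set, val_set, test_set = split_dataset(all_images)
# Augmentation is applied to the training copies only, after this split,
# so no augmented variant of a validation/testing image leaks into training.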
Figure 3.5 Example of CPU socket
Table 3.2 Summary of the study dataset
Dataset    Training    Validation    Testing    Total
Figure 3.6 Classification of image denoising
There are two image denoising domains: the spatial domain and the transform domain. The spatial domain consists of local filtering and nonlocal filtering. The transform domain consists of the Fourier transform (FT), wavelet transform (WT), Wiener filtering and multi-scale geometric analysis (MGA).
Linear filtering consists of mean filtering and Gaussian filtering. Mean filtering is uncomplicated and quick, but it makes the image lose some detail. Gaussian filtering can remove Gaussian noise effectively. Gaussian noise is a statistical noise whose probability density function equals that of a normal distribution; it is generated by adding a random Gaussian function to the image function. Since it arises in amplifiers and detectors, it is also known as 'electronic noise', and it is caused by the discrete nature of radiation from warm objects.
Figure 3.7 Example of Gaussian noise applied on socket image
Contrary to Gaussian noise, uniform noise is signal dependent (unless dithering is applied) and follows a uniform distribution. It is caused by the quantization of the pixels of an image to several discrete levels. It is generally created when analog data are converted to digital form and is not often encountered in real-world imaging systems.
Figure 3.8 Example of Uniform noise on socket image
Non-linear filtering consists of median filtering, mathematical morphology filtering and bilateral filtering. Median filtering can effectively suppress pulse noise and salt-and-pepper noise, so it was used in this study to denoise the socket surface defect images. Impulse or "salt and pepper" noise is the sparse occurrence of maximum (255) and minimum (0) pixel values in an image. It can be noticed as black pixels in bright regions and white pixels in dark regions. This type of noise is caused by sharp and sudden disturbances in the image signal and is mainly generated by errors in analog-to-digital conversion or bit transmission.
Figure 3.9 Example of impulse noise (salt and pepper noise) on socket image
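The following sketch illustrates the filtering described above with OpenCV; the file name, noise amount and kernel sizes are illustrative assumptions rather than the study's exact settings.

import cv2 as cv
import numpy as np

img = cv.imread("socket_pin.png")            # placeholder BGR socket image

# Simulate impulse (salt & pepper) noise for demonstration.
noisy = img.copy()
mask = np.random.rand(*img.shape[:2])
noisy[mask < 0.02] = 0                       # pepper pixels
noisy[mask > 0.98] = 255                     # salt pixels

# Median filter: replaces each pixel by the median of its 5x5 neighborhood,
# which suppresses isolated black/white pixels while keeping edges.
denoised = cv.medianBlur(noisy, 5)

# Gaussian filter for comparison (effective mainly against Gaussian noise).
smoothed = cv.GaussianBlur(noisy, (5, 5), sigmaX=1.5)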
The basic morphological operations consist of erosion, dilation, opening and closing, and the structural elements provide a good morphological denoising effect. They can be used to denoise socket images and clean small components of images. Bilateral filtering considers both the spatial information and the grey-level similarity of the image, preserving edges while removing noise. Bilateral filtering is light and non-iterative; however, it damages the texture of the image to a certain extent.
The nonlocal means filter is one of the spatial domain methods. The current pixel value is obtained as a weighted average of pixels in the image with similar regions. The nonlocal means filter makes use of the redundant information in the image and maintains the detailed features of the image to a large extent during denoising. Its disadvantage is that it requires complex computation and experienced selection of parameters.
Convolutional neural network and transfer learning
The human brain consists of billions of interconnected neurons, through which commands pass, ending in a specific action. The key to the artificial neural network (ANN) is a design that enables it to process information in a way similar to the biological brain. An NN is a mathematical model of the human brain. Mathematically, an NN is a nonlinearity applied to an affine function (the input x is passed through an affine function and then a nonlinearity):

y = \rho(Wx + b)

where W is the weight matrix, b is the bias and \rho is the nonlinearity; a common activation function is the sigmoid function \rho(z) = \frac{1}{1 + e^{-z}}.
Figure 3.13 Sketch of an artificial neuron [39]
A neuron is composed of 3 main components: neural connections, a summation block, and an activation function
Neural connections are elements of the neural network that link neurons together, connecting the output of one neuron to the input of another in a different layer A distinctive feature of these connections is a weight assigned to each signal that passes through, which is multiplied by this weight These connection weights are the fundamental free parameters of the neural network, which can be adjusted to adapt to the surrounding environment
The summation block is used to calculate the sum of the input signals to the neuron, each having been multiplied by the corresponding connection weights The operation described here creates a linear combination
The activation function, also known as a non-linear activation function, transforms the linear combination of all input signals into an output signal. This activation function ensures the non-linear properties of neural network computations. It acts as a limiting function, restricting the range of the output signal to a finite set of values.
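A minimal numerical illustration of one artificial neuron, combining the three components above (connection weights, summation block, sigmoid activation); the input and weight values are arbitrary examples.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])   # input signals
w = np.array([0.8,  0.1, -0.4])  # connection weights (free parameters)
b = 0.2                          # bias

output = sigmoid(np.dot(w, x) + b)   # summation block + activation function
print(output)                        # a value limited to the range (0, 1)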
An NN is a collection of neurons connected in an acyclic graph and structured in layers. The NN originates from the perceptron. A single perceptron is not sufficient to model any practical application, but the multilayer perceptron (MLP) is. The MLP is an architecture expanded from the single-layer perceptron by adding additional layers of units known as hidden layers; it consists of an input layer, hidden layers and an output layer. A deep NN is an NN with many input units, several hidden layers of units and an output layer. The number of output units depends on the classes we need to predict.
NN architectures can be divided into two main groups: feedforward networks and recurrent (feedback) networks. NNs are suitable for problems such as pattern matching, classification, function approximation, optimization, vector quantization and data space partitioning, where traditional methods may not be capable of addressing these issues effectively.
DL is used to build powerful vision systems that see and predict based on raw input data. It is leading computer vision because of its ability to learn directly from raw data or raw images. Traditional methods apply manual feature extraction based on prior knowledge and are not robust against noise. An NN learns features directly from images and helps build a more complete set of features. To feed a fully connected NN (FCN), the image, a 2D array, is flattened into a vector; the spatial relations are then lost, and the enormous number of parameters is highly inefficient. To maintain spatial structure, a patching and sliding operation is used, a mathematical operation called convolution.
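The contrast between flattening and convolution can be seen in a short PyTorch sketch; the layer sizes are illustrative only.

import torch
import torch.nn as nn

img = torch.randn(1, 3, 40, 40)              # one RGB 40x40 socket-pin crop

fc = nn.Linear(3 * 40 * 40, 64)              # fully connected layer on flattened input
conv = nn.Conv2d(3, 64, kernel_size=3, padding=1)  # 3x3 sliding filters

print(sum(p.numel() for p in fc.parameters()))    # 307,264 parameters
print(sum(p.numel() for p in conv.parameters()))  # 1,792 parameters

flat_out = fc(img.flatten(start_dim=1))      # spatial layout is lost
conv_out = conv(img)                         # spatial relations are preserved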
DL is a distinct branch of the field of ML, and it has become popular in the recent decade as scientists have been able to leverage the powerful computational capabilities of modern computers as well as the vast amounts of data (images, audio, text, etc.) available on the Internet. The relationship between DL and ML, as well as other related fields, can be clearly seen in the diagram below.
Figure 3.15 An illustration of the position of deep learning (DL) within the area of artificial intelligence (AI) [40]
Networks trained using deep learning methods are also known by another name, deep neural networks (DNNs), due to their mode of operation. Fundamentally, these networks comprise numerous different layers, each analyzing the input data from various perspectives and at increasingly higher levels of abstraction.
In detail, in a deep learning network for image recognition, the initial layers perform very simple tasks such as detecting straight lines, curves or color blobs in the input image. This information is then used as input for the subsequent layers, which have the more complex task of identifying the components of objects in the image from these lines and edges. Finally, the highest layers in the network take on the task of detecting objects in the image.
By learning information from images through many different layers and levels, these methods help computers understand complex data step by step through multiple layers of simple information. This is also why they are called DL methods.
Despite their many advantages in training computers for complex problems, DL still has many limitations that prevent it from being applied to solve all issues. The biggest limitation of this method is the requirement on the size of the training data: DL models require a vast amount of input data to learn through many layers with many neurons and parameters. Additionally, computing on such a large scale of data and parameters also demands the processing power of large server computers. The process of data selection and model training is time-consuming and labor-intensive, making experimentation with new parameters for the model a daunting and difficult task. However, thanks to transfer learning methods, this major limitation is no longer as severe an issue as before; this will be discussed in detail in later chapters.
Beyond the limitations of input data size, DL is still not intelligent enough to recognize and understand complex logic like humans; the tasks it performs are still relatively mechanical and need improvement to become "smarter."
Convolutional neural networks (CNNs) are the deep learning model that enables us to build the highly accurate intelligent systems seen today. They use convolution operations to extract input features, combined with non-linear activation functions such as ReLU, to generate more abstract information for subsequent layers. This process is repeated across multiple hidden layers (using convolutional filters) to ultimately produce a set of features for object recognition.
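A minimal CNN sketch in PyTorch reflecting this description (convolution + ReLU blocks followed by a classifier); the channel counts and input size are assumptions for illustration.

import torch
import torch.nn as nn

class SmallSocketCNN(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 10 * 10, num_classes),   # for 40x40 inputs
            nn.LogSoftmax(dim=1),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

log_probs = SmallSocketCNN()(torch.randn(4, 3, 40, 40))  # shape (4, 2) log-probabilities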
Model establishment
To handle images as two-dimensional data, a 2D-CNN model was developed [10]. It implements convolutional operations to extract local patterns and features from input images. Each CNN layer has filters, also known as kernels, that slide across the image, and areas that match the filters give high scores. All layers have a ReLU activation function, and L2 weight regularization is applied to address overfitting [11]. Overfitting is often encountered when training on high-dimensional data or with a highly complex model with a large number of hidden layers, which makes the trained model fit too closely to the training dataset and give bad performance on the testing dataset. In the process of enhancing model learning performance, this study aimed to identify the optimal balance between the objectives of model performance and computational resources.
To extract the relevant features, avoid overfitting and optimize execution time, the proposed model is a CNN pre-trained on the ImageNet dataset, later fine-tuned with the target dataset to recognize socket surface defects. Different typical CNN frameworks were evaluated for the classification task in this study, including AlexNet, EfficientNet-B1, DenseNet201 and ResNet18. A lightweight network architecture was preferred, as it could later improve real-time performance and support cross-platform deployment.
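A hedged sketch of loading the four candidate ImageNet-pretrained backbones from torchvision for fine-tuning; the weights string assumes the torchvision >= 0.13 weights API, and freezing the feature extractor is one common transfer-learning choice rather than a mandated step.

from torchvision import models

backbones = {
    "AlexNet":         models.alexnet(weights="IMAGENET1K_V1"),
    "EfficientNet-B1": models.efficientnet_b1(weights="IMAGENET1K_V1"),
    "DenseNet201":     models.densenet201(weights="IMAGENET1K_V1"),
    "ResNet18":        models.resnet18(weights="IMAGENET1K_V1"),
}

# Freeze the pretrained feature extractors; only the new classifier head
# (replaced later for the 2-class socket task) is fine-tuned.
for net in backbones.values():
    for p in net.parameters():
        p.requires_grad = False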
Figure 3.24 Performance of top accuracy model [36]
Optimization with OpenVINO method
According to [35], OpenVINO™ is an open-source software toolkit for optimizing and deploying DL models. It helps enhance DL performance in computer vision, automatic speech recognition, generative AI, natural language processing and many other common tasks. Models developed in popular frameworks are supported, such as TensorFlow, PyTorch, ONNX, Keras and PaddlePaddle. A broad range of platforms, from edge to cloud, is compatible with OpenVINO, with reduced resource requirements and efficient deployment.
OpenVINO is known for cutting-edge compression features and advanced performance capabilities to optimize and scale AI applications. First, the model is built in the training framework, or a pretrained model is obtained from Hugging Face. Second, the model is optimized with OpenVINO so that it responds faster and requires less memory. Compression results in smaller and faster models that can be deployed across devices. NNCF provides a suite of advanced algorithms for neural network inference optimization in OpenVINO with minimal accuracy drop. We used 8-bit quantization in post-training mode (without the fine-tuning pipeline) to optimize our model.
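A hedged sketch of 8-bit post-training quantization with NNCF followed by conversion to OpenVINO; the placeholder model, calibration data and input shape stand in for the study's fine-tuned network and training loader.

import nncf
import openvino as ov
import torch
from torch.utils.data import DataLoader, TensorDataset
from torchvision import models

# Placeholder model and calibration data (assumptions for illustration).
model = models.alexnet(weights=None)
model.classifier[6] = torch.nn.Linear(4096, 2)
model.eval()
calib_loader = DataLoader(
    TensorDataset(torch.randn(32, 3, 224, 224),
                  torch.zeros(32, dtype=torch.long)),
    batch_size=8)

def transform_fn(batch):
    images, _labels = batch           # calibration uses inputs only
    return images

# Insert INT8 quantizers and initialize them from the calibration subset.
quantized = nncf.quantize(model, nncf.Dataset(calib_loader, transform_fn))

# Convert the quantized model to OpenVINO IR and compile it for CPU inference.
ov_model = ov.convert_model(quantized, example_input=torch.randn(1, 3, 224, 224))
compiled = ov.Core().compile_model(ov_model, "CPU")
output = compiled(torch.randn(1, 3, 224, 224).numpy())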
Software
The ML development was implemented in a Jupyter notebook using Python and PyTorch. All software operations were performed on a Windows 11 64-bit operating system, powered by an Intel(R) Core(TM) i7-9700 CPU @ 3.00 GHz and equipped with 64 GB of RAM.
RESULTS AND DISCUSSION
Experiment description
Image samples of good and defective socket pins are shown in figure 4.3. There are various types of defects that might exist on a socket surface, with different dimensions, colors and shapes. Socket pins and foreign material exhibit a similar appearance, with comparable colors under room lighting. In addition, sockets were observed to have hundreds of pins, making it challenging to classify them manually by human vision. Therefore, it is critical to develop an intelligent model to recognize good and bad socket pins and then alert technicians for troubleshooting and repair actions. The experimental system of this thesis was set up as follows:
• The Evocam was set up on a workstation with lighting from a fluorescent lamp above
• The Evocam and a computer were connected to capture and store images
• The computer was used for image processing and AI model training and validation
• A monitor was mounted to display defect detection results
The principal diagram of designed system is shown in figure below:
An example of a socket image used in the experiment is shown below:
Figure 4.2 Image from system setup
Next, the images were preprocessed by removing noise. The median filter takes all the pixels under a kernel area and replaces the center element with the median value; it is highly effective against impulse noise.
Table 4.1 Effect of the median filter against multiple noise types
Type of noise    Before    After median filter
The Gaussian filter employs a Gaussian kernel in place of a box filter and is effective against Gaussian noise.
Table 4.2 Effect of the Gaussian filter against multiple noise types
Type of noise    Before    After Gaussian filter
After collecting all socket samples for data collection, standard socket inspection was performed on all sockets by a human using the mentioned microscope model. Socket inspection followed the current guidelines documented in the facility operation specification. The sockets were placed on the microscope and, under 10x optical magnification, good and bad pins were classified manually. The socket image was then saved to the computer drive, and the regions of interest (ROIs), each an individual socket pin and its surrounding area, were extracted as in figure 4.3. These images acted as input images of size 40x40 pixels. Since most of the pretrained models provided in torchvision (the newest version) already add self.avgpool = nn.AdaptiveAvgPool2d((size, size)) to resolve incompatibilities with input size, an input size different from the 224x224 used when training on ImageNet is not an issue. The next step was labelling each extracted region of interest as "good" or "bad" by placing it in one of two separate folders.
Figure 4.3 Example of a good sample (a) and bad samples (b-c)
Figure 4.4 Collection of input images
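The good/bad folder labeling described above can be loaded directly with torchvision; the directory layout ("dataset/good", "dataset/bad") is an assumed example, not the study's exact paths.

from torchvision import datasets, transforms
from torch.utils.data import DataLoader

preprocess = transforms.Compose([
    transforms.ToTensor(),      # scales pixel values to the [0, 1] range
])

# ImageFolder maps each sub-folder name ("bad", "good") to a class index.
dataset = datasets.ImageFolder("dataset", transform=preprocess)
loader = DataLoader(dataset, batch_size=32, shuffle=True)
print(dataset.class_to_idx)     # e.g. {'bad': 0, 'good': 1}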
In this study, the proposed method was implemented in the PyTorch framework with the OpenCV package. The training batch size was set to 32, the number of epochs was set to 50, and the NLL loss function and Adam optimization method were used. The initial learning rate was set to 0.001, and a learning-rate scheduler was later applied to further improve learning effectiveness. The accuracy metric was defined as the proportion of correct model predictions out of the total samples. The loss function is specified as:

L = -\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{C} y_{ij}\log(p_{ij})

where:
n is the number of data points in the dataset
C is the number of classes in the classification problem
y_ij is the binary indicator that is 1 if class j is the correct class for data point i and 0 otherwise
p_ij is the predicted probability that data point i belongs to class j
The activation function of the classifier is LogSoftmax, which worked better than the softmax or argmax function [17]. State-of-the-art models for the classification task were also implemented for a comparative experiment. The models were set up with the same hyperparameters and optimizer to evaluate classification performance.
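A training-configuration sketch matching the settings above (batch size 32, 50 epochs, NLL loss with a LogSoftmax head, Adam, learning rate 0.001); the torchvision weights string and the step scheduler are assumptions standing in for the study's exact scheduler.

import torch
import torch.nn as nn
from torchvision import models

model = models.alexnet(weights="IMAGENET1K_V1")
# Replace the 1000-class head with a 2-class head producing log-probabilities.
model.classifier[6] = nn.Sequential(nn.Linear(4096, 2), nn.LogSoftmax(dim=1))

criterion = nn.NLLLoss()                     # expects log-probabilities
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
# One possible "changing scheduler" (assumed settings, not the study's).
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)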
In this study, model performance was evaluated by common evaluation metrics: precision, accuracy, recall and F1-score. Precision indicates the ratio of correct positive predictions to all positive predictions; if precision is high, the model predicts positive samples precisely. Accuracy is the overall correctness of prediction, comparing correct predictions to total predictions. Recall indicates the ratio of correct positive predictions to all actual positive samples; high recall means the model has a low false negative rate, i.e. it captures positive samples effectively. F1-score is the most common evaluation metric, calculated by combining precision and recall. The formulas for these metrics are shown below:

Precision = TP / (TP + FP)
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Recall = TP / (TP + FN)
F1-score = 2 * Precision * Recall / (Precision + Recall)

where TP is true positives, TN is true negatives, FP is false positives and FN is false negatives. For this study, a good sample is positive and a bad sample is negative.
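The metric computation is straightforward from the confusion-matrix counts; the counts in the example call are illustrative only, not the study's actual confusion matrix.

def classification_metrics(tp, tn, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, accuracy, recall, f1

# Illustrative counts only:
print(classification_metrics(tp=57, tn=55, fp=3, fn=1))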
Data augmentation and balancing
During model development, class imbalance during training and the use of data augmentation were factors that contributed to hyperparameter optimization [19]. In the imbalanced dataset, the initial training and testing data had different proportions of the good and bad categories. Good samples dominated bad samples, which could affect model training accuracy, since the imbalance makes the model tend to classify inputs mostly as good. To augment the imbalanced dataset, three transformations were applied to each image: rotation by a random angle between 0 and 90 degrees, random cropping and resizing, and horizontal/vertical flipping. This produced four altered copies of each bad sample image to be added to the input dataset, which balanced the good and bad sample proportions. Alternatively, a balanced subset was achieved by reducing the number of good samples compared with the original imbalanced dataset.
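A sketch of the three augmentation transforms described above (random rotation up to 90 degrees, random crop and resize, horizontal/vertical flips); the crop scale and probabilities are assumed values.

from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomRotation(degrees=(0, 90)),
    transforms.RandomResizedCrop(size=40, scale=(0.8, 1.0)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomVerticalFlip(p=0.5),
    transforms.ToTensor(),
])
# Applying the pipeline several times to one PIL image yields the additional
# altered copies used to balance the "bad" class against the "good" class.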
Figure 4.6 Augmentation method example: original image (a); rotated image (b); random cropped image (c); flipped image (d)
The experiments were conducted using three different subsets: the imbalanced subset, the balanced subset and the balanced subset with data augmentation. The same pre-trained AlexNet had its last layer trained separately using the original imbalanced set, the balanced set and the augmented set. The reference baseline was the performance of the CNN trained using the imbalanced set. The same hyperparameters were used for each experiment. The results are shown in figure 4.7.
The results in figure 4.7 indicate that balancing the dataset by dropping good samples gave only a slight 1.5% accuracy improvement compared with the imbalanced subset. A side effect of this method is that the input dataset size is also reduced. Utilizing a balanced subset with data augmentation led to a 7% improvement in validation accuracy and reduced the overfitting observed on the original dataset by 8%. This highlights the necessity of balancing and augmenting the dataset, which can improve classification accuracy.
Figure 4.7 Validation accuracy comparison between subset
Classification results
• Build the model system in a Jupyter notebook using the Python language
• Train and test system performance in terms of accuracy, loss and the mentioned evaluation metrics
Table 4.3 Process steps for the defect detection system
- Region of interest (ROI) extraction: the ROI position is determined by the socket pin map; each product has its own socket pin map
- Data augmentation: artificially increase the input dataset with various image transformations (random cropping, rotation, flipping, ...)
- Training: in each epoch, compute the loss, backpropagate the gradients and update the parameters
- Validation: set the model to evaluation mode and compute the outputs, loss and validation accuracy
- Testing: validate the model with the unseen testing set and compute the evaluation metrics
After the training and testing datasets were formed, the DL model was established to explore the texture features of the input images and the classification algorithms. In addition, prior to building the model, we normalized the data to standardize texture features to the [0, 1] range, preventing the impact of differing numerical ranges. Thereby, potential biases toward specific features were mitigated, letting the model give consistent and reliable performance on our dataset. The CNN models loaded from torchvision were pretrained on ImageNet with more prediction classes than needed for our purpose, so each model was modified by replacing the last fully connected layer with one producing 2 output values (good/bad).
Figure 4.8 Load pre-trained model and optimizer
The model was trained in a Jupyter notebook with the collected dataset. Training was executed for 50 epochs with a learning rate of 0.001 and the loss function and optimizer selected above. An early stopper with patience = 3 and min delta = 0.001 on the validation loss was applied. The goal of training was to encourage a reduction in validation loss by monitoring its trend; if training did not lower the validation loss, it was terminated.
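A minimal early-stopping helper reflecting the settings above (patience = 3, min_delta = 0.001 on validation loss); this is a sketch, not the study's exact implementation.

class EarlyStopper:
    def __init__(self, patience=3, min_delta=0.001):
        self.patience, self.min_delta = patience, min_delta
        self.best_loss = float("inf")
        self.counter = 0

    def should_stop(self, val_loss):
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss      # meaningful improvement
            self.counter = 0
        else:
            self.counter += 1              # no improvement this epoch
        return self.counter >= self.patience

# Usage inside the training loop:
# if stopper.should_stop(validation_loss): break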
All models were initially pre-trained on the ImageNet dataset and subsequently fine-tuned on the same target domain dataset. AlexNet was shown to achieve the highest training and validation accuracy. Additionally, its execution time was the shortest among the models in this experiment. The AlexNet model achieved the best classification performance, with precision, accuracy, recall and F1-score of 95%, 96.6%, 98.3% and 96.6%, respectively. AlexNet was therefore chosen for further evaluation.
The AlexNet architecture is good at extracting fine-grained information from images, which makes it suitable for our classification application. Fine-grained image classification (FGIC) is a kind of image recognition in which minor categories, such as car models, cat breeds or bird species, need to be differentiated, and our application is a form of FGIC in which objects are finely categorized as good and bad socket pins. To summarize, ResNet, EfficientNet and DenseNet are favored for more complex tasks (more complex patterns to be learned) thanks to their deeper architectures, whereas AlexNet is suitable for simpler tasks.
Table 4.4 Comparison of evaluation metrics among models
Model    Precision    Accuracy    Recall    F1-Score    Time/epoch
As shown in figure 4.10, the training process stopped at epoch 12/50. After training was completed, the maximum accuracy the model could achieve was 99.1%. This was a positive result, so evaluation of the system with the unseen dataset began, as shown in figure 4.11.
Figure 4.10 The accuracy of training process
Figure 4.12 Training and validation loss
Figure 4.13 Training and validation accuracy
The developed model produced a maximum training accuracy of 91.5% and validation accuracy of 99.1%. The data demonstrate the advantage of transfer learning in dealing with a limited source of domain data as the training dataset.
Experimental result with testing set
Evaluating model performance with the unseen images in the testing set, the accuracy was 96.6% on average, as demonstrated in table 4.5.
Table 4.5 Prediction result with unseen images
Label True prediction Average accuracy
Figure 4.14 Correct recognition of the bad label
Figure 4.15 Failed recognition of the bad label
Optimization results
The goal of this experiment was to demonstrate how OpenVINO post-training optimization can prepare the proposed model for high-speed execution. With the model developed in the previous sections, we optimized it by post-training quantization using NNCF. It adds quantization layers into the model graph and then initializes the parameters of these added layers with a subset of the training dataset. Model performance improves when the model is converted to the OpenVINO Intermediate Representation format. Firstly, the optimization process started with the evaluation of the original model. Secondly, we transformed the original model into a quantized one. Thirdly, we evaluated the optimized model and compared its performance with the original model. The results proved a significant processing time reduction of about 40%, from 2.61 seconds to 1.52 seconds, to validate the testing dataset with the optimized model. The validation accuracy of the optimized and original models was similar.
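A rough timing sketch of this comparison, continuing the quantization sketch shown earlier (it reuses the model and compiled objects defined there; batch shape and size are assumptions).

import time
import torch

batch = torch.randn(32, 3, 224, 224)

t0 = time.perf_counter()
with torch.no_grad():
    _ = model(batch)                       # original FP32 PyTorch model
torch_time = time.perf_counter() - t0

t0 = time.perf_counter()
_ = compiled(batch.numpy())                # quantized OpenVINO model
ov_time = time.perf_counter() - t0
print(f"PyTorch: {torch_time:.3f}s  OpenVINO INT8: {ov_time:.3f}s")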
To further improve model performance in terms of accuracy, OpenVINO was utilized for PyTorch quantization-aware training, using Neural Network Compression Framework (NNCF) 8-bit quantization to optimize the PyTorch model for inference with the OpenVINO toolkit. Firstly, we created a quantization data loader and wrapped it in nncf.Dataset. Then, the model was quantized to obtain the optimized model. At the optimization step, a regular fine-tuning process was applied to further improve the accuracy of the quantized model. The quantized model demonstrated a further 1% improvement in training accuracy and reduced the loss to under 0.1.
Figure 4.16 Improvement of accuracy after optimization with OpenVINO
The performance of model against different noise
To assess the robustness and resilience of the model, this study used images from the original testing set and applied different types of noise, including motion blur, salt & pepper and Gaussian, defined as below (a generation sketch follows the list):
- Motion blur occurs when the image's clarity and sharpness are degraded; this haze often happens when the camera captures an image of a socket during its movement
- Salt-and-pepper noise is a type of image noise characterized by the occurrence of black and white pixels; its name comes from its appearance as black and white particles (like salt and pepper)
- Gaussian noise, or Gaussian white noise, is a statistical noise that has a probability density function (pdf) equal to that of the normal distribution
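A sketch of generating the three noisy test variants described above; the kernel size, noise amount and sigma are illustrative assumptions, and the file name is a placeholder.

import cv2 as cv
import numpy as np

def motion_blur(img, k=9):
    kernel = np.zeros((k, k), dtype=np.float32)
    kernel[k // 2, :] = 1.0 / k              # horizontal motion kernel
    return cv.filter2D(img, -1, kernel)

def salt_pepper(img, amount=0.02):
    out = img.copy()
    mask = np.random.rand(*img.shape[:2])
    out[mask < amount / 2] = 0
    out[mask > 1 - amount / 2] = 255
    return out

def gaussian_noise(img, sigma=15):
    noise = np.random.normal(0, sigma, img.shape)
    return np.clip(img.astype(np.float32) + noise, 0, 255).astype(np.uint8)

img = cv.imread("test_pin.png")               # placeholder test image
noisy_variants = [motion_blur(img), salt_pepper(img), gaussian_noise(img)]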
Figure 4.17 Example of a socket image polluted with different noise sources (original image, salt & pepper noise, etc.)
The evaluation metrics for the four image categories are shown in table 4.7. The accuracy of the model was mainly affected by the salt & pepper category, resulting in an accuracy of 74.1% and an F1-score of 65.1%. The recall of 48.3% shows that salt & pepper noise confused the model between real defects and added noise, causing it to classify actual good samples as bad. On the other hand, Gaussian noise and motion blur did not affect recall but mainly lowered the model's accuracy, to 79.3% and 85.3% respectively. These findings demonstrate that the model was vulnerable to noise and needed to be enhanced to adapt to environmental conditions.
Table 4.7 Performance of the proposed model against various noise sources
Noise Source Precision Accuracy Recall F1-Score
To address the instability and inaccurate classification of the model in an adverse environment, the model was enhanced by applying the three mentioned types of noise to the original training set and retraining. The evaluation metrics of the retrained network are depicted in figure 4.18.
Figure 4.18 Performance of enhanced model against different noise sources
Original model, motion blur, good: 58/58 (100% average accuracy)
Enhanced model, motion blur, good: 56/58 (97% average accuracy)
To summarize, the enhanced network's classification improved when faced with the different noises, with accuracy consistently exceeding 90%. This demonstrates that the proposed model has robustness and adaptability against background noise.
Figure 4.19 Classification of proposed model with background noise
CONCLUSION
Conclusion
The study results were presented in each chapter of this thesis and evaluated through experiments. The thesis completed its research objectives, with the main contributions as follows:
(1) Researched and summarized the development of computer vision and artificial intelligence. During its development, artificial intelligence faced many challenges and difficulties, such as limited computational resources and a lack of data libraries. Nevertheless, artificial intelligence, ML and DL play an important role in human development in modern life. In chapter 2, it was noted that traditional methods were first invented to solve simple and specific tasks but formed the theoretical basis for advanced modern algorithms later on.
(2) In chapter 3, classification methods based on ML and DL were described. The most powerful algorithm and model, the convolutional neural network, was introduced as a solution to many challenging computer vision problems and tasks. With the introduction of CNNs, ML and DL reached further remarkable milestones, where the related developments and inventions brought benefits to human life and industry.
(3) To address the limited and unbalanced classes in socket surface defect recognition, this study proposed a transfer learning method. With this solution, the model can be pretrained on ImageNet, gaining knowledge from another field, and later fine-tuned with the target domain dataset of actual socket images. The unbalanced proportion of the original dataset was also addressed by an augmentation method comprising different transformations. The proposed method can achieve 96.6% testing accuracy. Environmental factors, which may greatly affect accuracy, were considered during the experiments.
(4) The OpenVINO toolkit boosted the performance of the proposed model further, with a 1% improvement in training accuracy and a loss reduction of 0.1. To detect socket surface defects, four common pre-trained CNN models were used, fine-tuned and validated on the collected dataset. Evaluation metrics, namely precision, accuracy, recall and F1-score, were applied to assess the performance of every model. The best performing model was the developed model, with a best F1-score of 96.6%. The results of this study promote the continued use of DL to further enhance surface defect detection.
(5) However, besides the remarkable achievements listed above, there is still work required to address the existing limitations of this study:
- This work focused only on classifying good and defective sockets, not on identifying the specific type of surface defect. The images in the dataset were still limited, which constrained model performance and accuracy
- Hyperparameters were pre-defined and have yet to be proved optimal for achieving the best model performance.
Future development
The developed model was proved to be an efficient solution, but there is still potential for further development and research in the future:
- More specific types of defects (burn, scratch, stain and contamination) will be studied to train and evaluate the model for detection
- Develop the system to have adaptive learning, giving the model the ability to learn and recognize defects of new sockets; this also helps the model successfully recognize new defect types when they are encountered in the future
- Further study should examine the effects of the various noise sources existing in the facility environment, to verify the model's robustness and flexibility
- Assess and find optimized hyperparameter values to replace the fixed values used during model training, to achieve better performance for classification tasks
- Finally, future research should aim to enhance the performance of the pre-trained CNN model to achieve better accuracy and a lower false alarm rate
The authors declare that they have no known competing financial interest or personal relationships that could have appeared to influence the work reported in this paper
[1] P Geng "Structural Design of LGA Loading Mechanisms for Intel CPU Stack Retention." Internet: https://www.researchgate.net/figure/The-first-Intel-LGA- and-socket-stack-a-LGA-package-and-socket-stack-b-LGA-
[2] J West "Semiconductor test sockets – A billion dollars of pin perfection." Internet: https://www.linkedin.com/pulse/semiconductor-test-sockets-billion- dollars-pin-perfection-john-west/, May 30, 2024
[3] J Shields "Socket 1156 burning issues…is the fire out?" Internet: https://www.overclockers.com/socket-1156-burning-issuesis-fire/, May 30,
[4] PCBWay "Introduction of MVI & ICT & X-Ray Inspection" Internet: https://www.pcbway.com/blog/Engineering_Technical/Introduction_of_MVI_ ICT _X_Ray_Inspection.html, May 30, 2024
[5] T Czimmermann et al., “Visual-based defect detection and classification approaches for industrial applications – a survey,” Sensors 20, vol 5, pp 1–25,
[6] B Tang et al., “Review of surface defect detection of steel products based on machine vision,” IET Image Process, vol 17, pp 303–322, 2023
[7] X Xie “A review of recent advances in surface defect detection using texture analysis techniques,” Elcvia Electron Lett on CV & IA 7, vol 3, pp 1–22, 2018
[8] V Appia and A Yezzi “Active geodesics: region-based active contour segmentation with a global edge-based constraint,” in Proc IEEE Conf on Computer Vision, vol ICCV’11, pp 1975–1980, 2011
[9] L Tao et al., “Colour edge detection based on the fusion of hue component and principal component analysis,” IET Image Process, vol 8, no 1, pp 44–55,
[10] N Kazy “Detection, quantification and classification of ripened tomatoes: a comparative analysis of image processing and machine learning,” IET Image Process, vol 14, no 11, pp 2442-2456, 2020
[11] D Karimi et al., “Effective supervised multiple-feature learning for fused radar and optical data classification,” IET Radar Sonar Navig, vol 11, no.5, pp 768–
[12] O Abass and N Golshah “Review of the application of machine learning to the automatic semantic annotation of images,” IET Image Process, vol 13, iss 8, pp 1232-1245, 2019
[13] J Johson et al., “Image retrieval using scene graphs,” presented at the IEEE
Conf on Computer Vision and Pattern Recognition, Boston, 2015
[14] Z Wang et al., “Detection of Insect-Damaged Maize Seed Using Hyperspectral Imaging and Hybrid 1D-CNN-BiLSTM Model,” Infrared Physics & Technology, vol 137, no 105208, 2024
[15] D Passos and P Mishra “Deep Tutti Fruitti: Exploring CNN architectures for dry matter prediction in fruit from multi-fruit near-infrared spectra,”
Chemometrics and Intelligent Laboratory Systems, vol 243, no 105023, 2023
[16] S Jiaquan et al., “Defect detection of printed circuit board based on lightweight deep convolution network,” IET Image Process, vol 14, iss 15, pp 3932-3940,
[17] B Song and N Wei “Statistics properties of asphalt pavement images for cracks detection,” J Inf Comput Sci.10, vol 9, pp 2833–2843, 2013
[18] D Kang et al., “Hybrid pixel-level concrete crack segmentation and quantification across complex backgrounds using deep learning,” Autom Constr., vol 118, no 103291, 2020
[19] S Unar et al., “Visual and textual information fusion using Kernel method for content based image retrieval,” Information Fusion, vol 44, pp 176–187, 2018
[20] X Wang and Z Wang “A novel method for image retrieval based on structure elements’ descriptor,” J Visual Commun Image Represent.24, vol 1, pp 63–
[21] J Kun et al., “A deep learning-based method for pixel-level crack detection on concrete bridges,” IET Image Process, vol.16, pp 2609–2622, 2022
[22] S Matarneh et al., “Evaluation and optimization of pre-trained CNN models for asphalt pavement crack detection and classification,” Automation in Construction, vol 160, no 105297, 2024
[23] B Tang et al., “An end-to-end steel surface defect detection approach via Swin transformer,” IET Image Process, vol 17, pp 1334–1345, 2023
[24] T Hui et al., “Detail texture detection based on Yolov4-tiny combined with attention mechanism and bicubic interpolation,” IET Image Process, vol 15, pp 2736–2748, 2021
[25] K Hoang et al., “Application of artificial neural network in automatic form processing,” in Conference proceeding of 25th anniversary of Communication technology Institute, 2001, pp.1-3
[26] D D Nguyen and M T Nguyen “Identification and application in writing letter identification,” B.A thesis, Hanoi National University, Ha Noi, 2023
[27] Q C Bui “Application of artificial neural network in letter identification,” B.A project, Hai Phong University, Hai Phong, 2007
[28] B L Nguyen “Building a student attendance system based on face recognition,” B.A thesis, Da Lat University, Lam Dong, 2022
[29] T A Nguyen “Application of deep learning in sentiment analysis,” M.A thesis, Posts and Telecommunications Institute of Technology, Ho Chi Minh City,
[30] D P Tran, “Adaptive learning solutions based on deep learning networks using object recognition,” Ph.D. dissertation, Duy Tan University, Da Nang, 2020
[31] H O Pham et al., “Combining Deep learning features in supporting automatic diagnosis of eye diseases,” in Proceedings of the XIII National Science and Technology Conference, Nha Trang, 2020, pp 198–205
[32] V P Nguyen “Application of deep learning in fruit classification,” M.A thesis,
Ho Chi Minh City University of Foreign Languages and Informatics, Ho Chi Minh City, 2019
[33] S Zhang et al., “An FCN-based transfer-learning method for spatial infrared moving-target recognition,” Infrared Physics & Technology, vol 137, no
[34] A Krizhevsky et al., “ImageNet Classification with Deep Convolutional Neural Networks,” Advances in Neural Information Processing Systems, vol 25, 2012
[35] Intel, “Intel® Distribution of OpenVINO™ Toolkit.” Internet: https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/overview.html, May 25, 2024
[36] L Melas-Kyriazi, “EfficientNet-PyTorch.” Internet: https://github.com/lukemelas/EfficientNet-PyTorch, May 30, 2024
[37] R Hui, “Other types of photodetectors,” in Introduction to Fiber-Optic Communications, A C Garcia, Ed. New York: Elsevier, 2019, pp 149
[38] M Bigas et al., “Review of CMOS image sensors,” Microelectronics Journal, vol 37, pp 433–451, 2006
[39] F Ghorbani et al., “A deep learning approach for inverse design of the metasurface for dual-polarized waves,” Applied Physics A, vol 127, pp 869–
[40] I H Sarker, “AI-Based Modeling: Techniques, Applications and Research Issues Towards Automation, Intelligent and Smart Systems,” SN Computer Science, vol 3, pp 158–178, 2022
[41] Z Zhu, “U-Net with Asymmetric Convolution Blocks for Road Traffic Noise Attenuation in Seismic Data,” Applied Sciences, vol 13, pp 4751–4766, 2023
[42] H Yingge, “Deep Neural Networks on Chip - A Survey,” presented at the IEEE International Conference on Big Data and Smart Computing, Korea, 2020
[43] A Sreenivas et al., “Indian Sign Language Communicator Using Convolutional Neural Network,” International Journal of Advanced Science and Technology, vol 29, no 3, pp 11015–11031, 2020
[44] A Krizhevsky et al., “ImageNet Classification with Deep Convolutional Neural Networks,” Advances in Neural Information Processing Systems, vol 25, 2012
[45] A T Gorji et al., “Combining deep learning and fluorescence imaging to automatically identify fecal contamination on meat carcasses,” Scientific Reports, vol 12, pp 2392–2404, 2022
[46] F Ramzan et al., “A Deep Learning Approach for Automated Diagnosis and
Multi-Class Classification of Alzheimer’s Disease Stages Using Resting-State fMRI and Residual Neural Networks,” Journal of Medical Systems, vol 44, pp
[47] R Mahum et al., “A novel framework for potato leaf disease detection using an efficient deep learning model,” Human and Ecological Risk Assessment, vol 29, pp 1–24, 2022
1 ROI extraction

import sys
import cv2 as cv
import numpy as np
import os
import shutil

# Reset the scratch folder that will hold the cropped ROI tiles
if os.path.isdir('temp'):
    shutil.rmtree('temp')
os.mkdir('temp')

path = r"data_adl/"
socket_id = '18'
src0 = cv.imread(path + str(socket_id) + '.jpg', cv.IMREAD_COLOR)  # untouched copy used for cropping
src = cv.imread(path + str(socket_id) + '.jpg', cv.IMREAD_COLOR)   # copy used to draw detected circles
src1 = cv.imread(path + str(socket_id) + '.jpg', cv.IMREAD_COLOR)  # copy used to draw bounding boxes

def area_range():
    """Detect circular contact regions, crop them into temp/ and return their area statistics."""
    area_list = []
    min_area = 99999.0
    max_area = 0.0
    gray = cv.cvtColor(src, cv.COLOR_BGR2GRAY)
    gray = cv.GaussianBlur(gray, (31, 31), 3)
    rows = gray.shape[0]
    # Hough circle transform; param1/param2 and the radius limits below are
    # placeholder values and must be tuned for the actual socket images
    circles = cv.HoughCircles(gray, cv.HOUGH_GRADIENT, 1, rows / 100,
                              param1=100, param2=30, minRadius=1, maxRadius=0)
    index = 1
    if circles is not None:
        circles = np.uint16(np.around(circles))
        for i in circles[0, :]:
            center = (i[0], i[1])
            # circle center
            cv.circle(src, center, 1, (0, 100, 100), 3)
            # circle outline
            radius = i[2]
            cv.circle(src, center, radius, (255, 0, 255), 3)
            # bounding box enlarged by a factor k around the circle, clamped to the image border
            k = 1.5
            x1 = max(0, int(i[0] - radius * k))
            y1 = max(0, int(i[1] - radius * k))
            x2 = int(i[0] + radius * k)
            y2 = int(i[1] + radius * k)
            cv.rectangle(src1, (x1, y1), (x2, y2), (255, 0, 0), 3)
            cv.putText(src1, str(index), (x1, y1),
                       fontFace=cv.FONT_HERSHEY_COMPLEX, fontScale=0.3, color=(0, 0, 0))
            area = 3.14159 * i[2] * i[2]
            if area < min_area:
                min_area = area
            if area > max_area:
                max_area = area
            area_list.append(area)
            # crop the ROI tile from the untouched image and save it
            tiles = src0[y1:y2, x1:x2]
            if (tiles.shape[0] > 0) & (tiles.shape[1] > 0):
                cv.imwrite(os.path.join("temp", socket_id + "_" + str(index) + ".jpg"), tiles)
            index += 1
    cv.namedWindow("rec circles", cv.WINDOW_NORMAL)
    cv.imshow("gray circles", src0)
    cv.imshow("detected circles", src)
    cv.imshow("rec circles", src1)
    cv.waitKey(0)
    cv.destroyAllWindows()
    return min_area, max_area, area_list, circles

mina, maxa, area_list, circles = area_range()
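The listing above returns the per-circle areas but does not use them further. One possible follow-up, sketched below under the assumption that genuine contact regions on a socket have similar sizes, is to discard Hough detections whose area deviates strongly from the median before cropping; the filter_circles_by_area helper and the 0.5 tolerance are illustrative and not part of the original listing.

import numpy as np

def filter_circles_by_area(circles, area_list, tol=0.5):
    # Keep only circles whose area lies within +/- tol of the median area;
    # strong outliers are usually spurious Hough responses, not contact pads.
    areas = np.asarray(area_list, dtype=float)
    median_area = np.median(areas)
    keep = np.abs(areas - median_area) <= tol * median_area
    return [c for c, k in zip(circles[0], keep) if k]

# Example, using the values returned by area_range() above:
# good_circles = filter_circles_by_area(circles, area_list)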
2 Denoise

import cv2
import numpy as np
import os
import shutil

# Reset the temp scratch folder
if os.path.isdir('temp'):
    shutil.rmtree('temp')
os.mkdir('temp')

# Folder with the images to be denoised and folder for the results
folder_path = "C://Users//cnn_ob//socket_vi//classify//data_new2//val//good"  # "data_aug\bad"
save_path = "C://Users//cnn_ob//socket_vi//classify//data_new2//val2//good"

# Denoise each image in the folder
for filename in os.listdir(folder_path):
    if filename.endswith(".jpg"):
        # Load image
        img_path = os.path.join(folder_path, filename)
        img = cv2.imread(img_path)
        # Median blur removes salt-and-pepper noise, then a light Gaussian blur smooths the result
        denoised_img = cv2.medianBlur(img, 3)
        denoised_img = cv2.GaussianBlur(denoised_img, (3, 3), 0)
        # Save the denoised image under the same filename
        new_img_path = os.path.join(save_path, filename)
        cv2.imwrite(new_img_path, denoised_img)
3 Enhance

import cv2
import numpy as np
import os
import shutil

def clahe(img_path):
    """Apply CLAHE to the brightness channel of the image stored at img_path."""
    image = cv2.imread(img_path)
    # OpenCV loads images as BGR; convert to HSV so that only the V (brightness) channel is equalised
    img_hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
    clahe_op = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    img_hsv[:, :, 2] = clahe_op.apply(img_hsv[:, :, 2])
    # Convert back from HSV to BGR
    return cv2.cvtColor(img_hsv, cv2.COLOR_HSV2BGR)

# Reset the temp scratch folder that will hold the enhanced images
if os.path.isdir('temp'):
    shutil.rmtree('temp')
os.mkdir('temp')

# Folder with the images to be enhanced and folder for the results
folder_path = "C:\\Users\\thanhho1\\cnn_ob\\socket_vi\\add denoise\\temp"  # "data_aug\bad"
save_path = "temp"

# Apply CLAHE enhancement to each image in the folder
for filename in os.listdir(folder_path):
    if filename.endswith(".jpg"):
        # Load and enhance the image
        img_path = os.path.join(folder_path, filename)
        enhanced_img = clahe(img_path)
        # Save the enhanced image under the same filename
        new_img_path = os.path.join(save_path, filename)
        cv2.imwrite(new_img_path, enhanced_img)
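If preferred, the denoise step (2) and the CLAHE enhancement step (3) can be chained so each ROI tile is preprocessed in a single pass. The sketch below simply combines the operations from the two listings above; the preprocess_tile name and the example file path are illustrative, not part of the original code.

import cv2

def preprocess_tile(img_bgr):
    # Median + Gaussian blur (step 2) followed by CLAHE on the V channel (step 3);
    # the kernel sizes and CLAHE parameters mirror the listings above.
    img = cv2.medianBlur(img_bgr, 3)
    img = cv2.GaussianBlur(img, (3, 3), 0)
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    clahe_op = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    hsv[:, :, 2] = clahe_op.apply(hsv[:, :, 2])
    return cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)

# Example:
# tile = cv2.imread("temp/18_1.jpg")
# tile = preprocess_tile(tile)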
4 Image augmentation

import cv2
import numpy as np
import os

def get_random_crop(image, crop_height, crop_width):
    """Return a random crop of size (crop_height, crop_width) from image."""
    max_x = image.shape[1] - crop_width
    max_y = image.shape[0] - crop_height
    # randint's upper bound is exclusive, so +1 also allows crops flush with the border
    x = np.random.randint(0, max_x + 1)
    y = np.random.randint(0, max_y + 1)
    return image[y: y + crop_height, x: x + crop_width]

# Folder with the images to be augmented and folder for the results
folder_path = "image"  # "data_aug\bad"
save_path = "image"

# Perform augmentation for each image in the folder
for filename in os.listdir(folder_path):
    if filename.endswith(".jpg") or filename.endswith(".png"):
        print(filename)
        # Load image
        img_path = os.path.join(folder_path, filename)
        img = cv2.imread(img_path)
        # Apply augmentation techniques to create new images
        flipped_img = cv2.flip(img, 1)                                   # horizontal flip
        rotated_img = cv2.rotate(img, cv2.ROTATE_90_CLOCKWISE)
        c_rotated_img = cv2.rotate(img, cv2.ROTATE_90_COUNTERCLOCKWISE)
        cropped_img = get_random_crop(img, 30, 30)
        # Save the augmented variants; the filename suffixes are illustrative
        base, ext = os.path.splitext(filename)
        cv2.imwrite(os.path.join(save_path, base + "_flip" + ext), flipped_img)
        cv2.imwrite(os.path.join(save_path, base + "_rot90" + ext), rotated_img)
        cv2.imwrite(os.path.join(save_path, base + "_rot270" + ext), c_rotated_img)
        cv2.imwrite(os.path.join(save_path, base + "_crop" + ext), cropped_img)