INTRODUCTION
The increasing demand for metal products, particularly iron and steel, in everyday life and the construction sector underscores the significance of the metal manufacturing industry. As reported by the World Steel Association, Vietnam's steel market ranked as the seventh largest in Asia by the end of 2011, reflecting a growth rate aligned with economic expansion. With rising incomes and a continuing construction trend, there remains substantial potential for further growth in this industry.
According to the Viet Nam Chamber of Commerce and Industry (VCCI), small and medium-sized enterprises (SMEs) comprise 97% of businesses in Vietnam, employing over half of the domestic workforce and contributing more than 40% of GDP. These firms play a crucial role in driving economic growth; however, they currently face significant challenges, including outdated technology and a reliance on imported materials. To sustain and enhance the benefits derived from the metal manufacturing sector, it is essential to analyze the technical inefficiency levels of Vietnamese SMEs.
Technical efficiency measures how effectively a firm converts its inputs into outputs, with the production frontier representing the maximum possible output from a given input combination. Firms operating on this frontier are technically efficient, while those below it are technically inefficient. Analyzing technical efficiency typically involves constructing a production-possibility boundary and assessing each firm's distance from this boundary to determine its level of inefficiency.
Technical efficiency can be measured using two main approaches: deterministic and stochastic. The deterministic method, known as Data Envelopment Analysis (DEA), was introduced by Charnes, Cooper, and Rhodes in 1978. DEA uses linear programming on input and output data to construct a frontier without needing to specify a production function; however, it assumes that the data are free from statistical noise. The stochastic approach, referred to as Stochastic Frontier Analysis (SFA), was first discussed by Aigner, Lovell, and Schmidt in 1977 and by Meeusen and Broeck in the same year. Unlike DEA, SFA requires a specific functional form for the production function but allows the data to contain noise. SFA is used more often in practice because, in many cases, the noiseless assumption is unrealistic.
Since its first appearance in Aigner et al. (1977) and Meeusen and Broeck (1977), the literature on technical efficiency has been widely developed through many studies, such as Pitt and Lee (1981), Schmidt and Sickles (1984), Battese and Coelli (1988, 1992, 1995), Cornwell, Schmidt, and Sickles (1990), Kumbhakar (1990), Lee and Schmidt (1993), and Greene (2005) (see Greene (2008) for a review).
The method introduced by Battese and Corra (1977) has gained popularity for analyzing the performance of production units, including firms, regions, and countries; examples include Bravo-Ureta and Rieger (1991), Battese (1992), Dong and Putterman (1997), Anderson, Fish, Xia, and Michello (1999), and Cullinane, Wang, Song, and Ji (2006). Its versatility in handling various production processes makes it a valuable tool for performance evaluation in different contexts.
Despite the extensive literature on estimating technical efficiency, researchers often struggle to select the most suitable model. Early models focused on cross-sectional data, which required assumptions about the distribution of technical inefficiency and its independence from other model components. Critics such as Pitt and Lee (1981) and Schmidt and Sickles (1984) argued that such models could not estimate technical inefficiency consistently, leading to the development of panel data models that initially assumed time-invariant inefficiency. This assumption was in turn deemed too restrictive, prompting the introduction of models that allow inefficiency to vary over time. Battese and Coelli (1995) advanced this further by creating a model in which technical inefficiency varies with time and other factors. Greene (2005) introduced fixed- and random-effects models that allow unrestricted temporal changes in inefficiency and distinguish it from firm-specific factors.
This thesis evaluates the technical efficiency of Vietnamese metal manufacturing firms using panel-data stochastic frontier models. It also reviews various panel data models for analyzing technical inefficiency and discusses implications for model selection in this area. The research utilizes an unbalanced panel dataset of metal manufacturing firms for the years 2005, 2007, and 2009, drawn from the Vietnamese SME survey. The results show different technical efficiency levels across the stochastic frontier models.
Research objectives
- To give a review of panel-data stochastic frontier models;
- To apply those models to investigate the technical efficiency of SMEs in the metal manufacturing industry in Viet Nam.
LITERATURE REVIEW
Efficiency measurement
The primary economic function of a business is to transform inputs into outputs, which reflects its production capabilities. The output-to-input ratio serves as a measure of a firm's productivity, indicating its operational efficiency (Coelli, Rao, O'Donnell, & Battese, 2005). Consequently, changes in productivity are indicative of a production unit's effectiveness, making productivity growth a widely recognized proxy for assessing firm performance.
The terms productivity and efficiency need to be distinguished in the context of firm production.
Productivity encompasses all elements that influence the output generated from given input levels and is often referred to as Total Factor Productivity (TFP). Efficiency, in contrast, relates to the production frontier, which represents the maximum output achievable with a given amount of input. A firm is technically efficient when it operates on this frontier, while any production below it indicates inefficiency, with greater distance from the frontier signifying greater inefficiency. Variations in productivity can arise from changes in efficiency, adjustments in input levels and proportions, advances in technology, or a combination of these factors (Coelli et al., 2005).
Efficiency can be measured through input-oriented or output-oriented approaches. Input-oriented measures focus on minimizing the inputs required to achieve a given level of output, while output-oriented measures aim to maximize the output produced from a given set of inputs. Figures 2-1 and 2-2 illustrate these concepts. Figure 2-1 shows a firm using inputs X1 and X2. The isoquant YY' represents the minimum input combinations necessary for a given output, so a firm operating on this frontier is technically efficient in its input usage. The iso-cost line CC' shows the input ratio that minimizes cost. Technical efficiency (TE) is calculated as the ratio of OR to OP, and allocative efficiency (AE) as the ratio of OS to OR. The product of AE and TE expresses the overall efficiency of the firm, called economic efficiency (EE), i.e. \( EE = AE \times TE \).
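Using the points defined above, the three measures fit together as follows (a standard Farrell-type decomposition, stated here for clarity):
\[
TE = \frac{OR}{OP}, \qquad AE = \frac{OS}{OR}, \qquad
EE = AE \times TE = \frac{OS}{OR}\cdot\frac{OR}{OP} = \frac{OS}{OP}.
\]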
Figure 2-2 illustrates a firm using a single input to produce one output. The f(X) curve represents the maximum achievable output for each level of input X, forming the production frontier. A firm is technically efficient when it operates on this frontier, with technical efficiency measured as the ratio of BD to DE.
Numerous studies have measured and analyzed technical efficiency using two primary methodologies: Data Envelopment Analysis (DEA) and Stochastic Frontier Analysis (SFA). The following section provides a concise overview of these two approaches.
Data Envelopment Analysis (DEA) and Stochastic Frontier Analysis (SFA)
Data Envelopment Analysis (DEA) is a non-parametric approach to estimating firm efficiency, initially proposed by Charnes, Cooper, and Rhodes in 1978 under the assumption of constant returns to scale. The method was subsequently extended by Banker, Charnes, and Cooper in 1984 to accommodate decreasing and variable returns to scale, broadening its applicability in efficiency measurement.
Detailed instruction on the DEA method can be found in Banker et al. (1984), Charnes et al. (1978), Färe, Grosskopf, and Lovell (1985, 1994), and Ray (2004).
In a scenario with n firms, referred to as Decision Making Units (DMUs), each firm uses m different types of inputs to produce s distinct types of outputs. For the DMU under evaluation (denoted 0), the output-oriented DEA model of Charnes et al. (1978) is formulated as the ratio maximization
\[
\max_{w, v}\; h_0 = \frac{\sum_{r=1}^{s} w_r y_{r0}}{\sum_{i=1}^{m} v_i x_{i0}}
\quad \text{subject to} \quad
\frac{\sum_{r=1}^{s} w_r y_{rj}}{\sum_{i=1}^{m} v_i x_{ij}} \le 1,\; j = 1,\dots,n; \qquad w_r, v_i \ge 0,
\]
where \( x_{ij} \) and \( y_{rj} \) denote the ith input and rth output of the jth DMU, respectively, with \( i = 1, 2, \ldots, m \) and \( r = 1, 2, \ldots, s \), and the output and input weights \( w_r \) and \( v_i \) are obtained as the solution of the maximization problem. Following the piece-wise frontier introduced by Farrell (1957), the maximization problem is converted into a linear program that constructs the production frontier; the ratio of outputs to inputs of each firm is then compared with this frontier to determine its efficiency level.
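To make the mechanics concrete, the sketch below solves the equivalent envelopment (dual) form of the output-oriented, constant-returns DEA problem with a standard linear programming routine. The data, function name, and orientation are illustrative assumptions, not the formulation used in the original sources:

```python
import numpy as np
from scipy.optimize import linprog

def dea_crs_output(X, Y):
    """Output-oriented CRS (CCR-type) efficiency via the envelopment form.
    X: (n, m) inputs, Y: (n, s) outputs for n DMUs. Returns TE scores in (0, 1]."""
    n, m = X.shape
    s = Y.shape[1]
    scores = np.empty(n)
    for k in range(n):
        # decision variables: [phi, lambda_1, ..., lambda_n]; maximize phi -> minimize -phi
        c = np.r_[-1.0, np.zeros(n)]
        # output constraints: phi * y_rk - sum_j lambda_j * y_rj <= 0
        A_out = np.c_[Y[k].reshape(-1, 1), -Y.T]
        b_out = np.zeros(s)
        # input constraints: sum_j lambda_j * x_ij <= x_ik
        A_in = np.c_[np.zeros((m, 1)), X.T]
        b_in = X[k]
        res = linprog(c, A_ub=np.vstack([A_out, A_in]), b_ub=np.r_[b_out, b_in],
                      bounds=[(None, None)] + [(0, None)] * n)
        scores[k] = 1.0 / res.x[0]            # TE = 1 / phi*
    return scores

# toy data: 5 firms, 2 inputs, 1 (identical) output
X = np.array([[2., 3.], [4., 2.], [3., 3.], [5., 5.], [6., 2.]])
Y = np.ones((5, 1))
print(np.round(dea_crs_output(X, Y), 3))
```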
Since its emergence in 1978, Data Envelopment Analysis (DEA) has gained significant traction as a method for efficiency analysis across industries in both the private and public sectors. Wei (2001) highlights several key developments in DEA research, reflecting its growing importance: the advancement of numerical methods and computer programs has improved the quality and quantity of DEA studies; new DEA models have been introduced, including the additive model, the log-type DEA model, and the stochastic DEA model; the economic and management aspects of DEA have been examined in greater depth, further solidifying its application foundation; and mathematicians have promoted the mathematical theory related to DEA, fostering both theoretical advances and practical applications of this non-parametric approach.
Aigner et al. (1977) and Meeusen and Broeck (1977) proposed the stochastic production frontier method to measure firms' efficiency. The model can be described mathematically as
\( Y_i = f(X_i, \beta) + V_i - U_i. \)
Here \( Y_i \) denotes firm output, \( X_i \) the input vector, and \( \beta \) the parameter vector to be estimated. The model incorporates two error terms: \( V_i \), random statistical noise with a normal distribution and zero mean, and \( U_i \), a non-negative inefficiency term that reflects how far a firm lies from optimal production. The distribution assumed for \( U_i \) may be half-normal (Aigner et al., 1977), exponential (Meeusen & Broeck, 1977), gamma (Greene, 1990), or a non-negative truncation of \( N(\mu, \sigma^2) \) (Battese & Coelli, 1988, 1992, 1995). A trade-off also exists between Data Envelopment Analysis (DEA) and Stochastic Frontier Analysis (SFA) in evaluating firm efficiency, as discussed next.
DEA and SFA each offer advantages and disadvantages that force trade-offs when researchers choose between them. DEA, as a non-parametric method, is deterministic and does not require a production function to be specified, while SFA employs a stochastic, econometric framework that does require model specification. This fundamental difference makes DEA non-statistical: it assumes the data are free from noise, which can be problematic in real-world settings where measurement errors and random factors exist. SFA, in contrast, accommodates statistical noise and thus handles real-world data more flexibly, albeit at the cost of assumptions about model specification and error distributions. Furthermore, DEA treats any deviation from the efficiency frontier as inefficiency, whereas SFA distinguishes between noise and inefficiency, so SFA typically yields higher efficiency estimates.
Data Envelopment Analysis (DEA) is advantageous in complex production conditions because it simplifies the relationship between inputs and outputs without requiring a specific production function. However, DEA lacks statistical properties, so its goodness of fit and model specification cannot be tested. Stochastic Frontier Analysis (SFA), by contrast, offers econometric tools to assess model suitability, and its primary strength is the ability to handle statistical noise. Therefore, in industries with tightly controlled production processes, DEA is often the preferred method.
Measuring efficiency in industries with stable production processes benefits from minimal random fluctuation, which allows outputs to be determined precisely from given inputs. In contrast, SFA is more suitable for industries, such as metal manufacturing, in which noise and random fluctuations are unavoidable and firms must navigate variable market conditions and policy changes. Given the characteristics of the industry analyzed here, SFA emerges as the preferred model. The following sections examine the SFA method, covering both cross-sectional and panel data models.
The cross-sectional Stochastic Frontier Model
The cross-sectional stochastic frontier model in Aigner et al. (1977) can be described as \( Y = f(X, \beta) + v - u \), where \( v \) represents random statistical noise and \( u \geq 0 \) represents technical inefficiency. To differentiate between these two residual components, certain assumptions must be made. The primary assumption concerns the distribution of \( v \), which is assumed to be symmetric, specifically normal.
The distance from the frontier is represented by the non-negative term \( u \), which indicates how far a firm operates below optimal efficiency. Various statistical distributions have been proposed for \( u \), including the half-normal (Aigner et al., 1977), the exponential (Meeusen & Broeck, 1977), the gamma (Greene, 1990), and a non-negative truncation of the normal distribution (Battese & Coelli, 1988, 1992, 1995).
Econometric estimation methods, specifically Ordinary Least Squares (OLS) and Maximum Likelihood (ML), can be used to estimate the model. However, the error term consists of two components, one with an asymmetric distribution and the other with a symmetric distribution, so the composed error term \( \varepsilon = v - u \) is not normally distributed and its mean, \( E(\varepsilon) = -E(u) \), is negative rather than zero.
This makes the OLS estimate of the intercept biased downward. To clarify, consider a regression with only an intercept \( \alpha \): \( y = \alpha + \varepsilon \), where \( \varepsilon = v - u \); the OLS estimator of \( \alpha \) is \( \bar{y} \), and \( E(\bar{y}) = \alpha - E(u) \), which does not equal \( \alpha \). Winsten (1957) suggested a method called Corrected Ordinary Least Squares (COLS), while Afriat (1972) and Richmond (1974) offered the method of Modified Ordinary Least Squares (MOLS).
To address this bias, MOLS and COLS adjust the OLS intercept upward using the OLS residuals: COLS shifts it by the maximum residual, while MOLS shifts it by an estimate of \( E(u) \) derived from the residuals under an assumed distribution. However, both MOLS and COLS face problems, including the possibility that their estimates are not statistically meaningful (Mastromarco, 2007). ML, by contrast, has desirable asymptotic properties and is able to deal with the asymmetrically distributed residual, and is therefore used more frequently than the OLS-based corrections.
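A minimal illustration of the COLS correction on simulated data (not the thesis's dataset or code; all names and values below are hypothetical):

```python
import numpy as np

# Fit OLS, then shift the intercept up by the largest residual so the
# corrected frontier bounds all observations from above.
rng = np.random.default_rng(1)
n = 200
x = rng.uniform(1, 5, size=n)
u = rng.exponential(scale=0.3, size=n)      # one-sided inefficiency
y = 2.0 + 0.7 * x - u                        # deterministic-frontier case (no noise)

X = np.column_stack([np.ones(n), x])
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_ols
beta_cols = beta_ols.copy()
beta_cols[0] += resid.max()                  # COLS intercept correction
u_hat = (X @ beta_cols) - y                  # estimated inefficiency (>= 0)
print(beta_ols[0], beta_cols[0], round(u_hat.min(), 6))
```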
With technical inefficiency \( u \) following a half-normal distribution, i.e. \( u \sim N^+(0, \sigma_u^2) \) (Aigner et al., 1977), the log-likelihood function is
\[
\ln L = N\ln\sqrt{\tfrac{2}{\pi}} - N\ln\sigma + \sum_{i=1}^{N}\ln\Phi\!\left(-\frac{\varepsilon_i\lambda}{\sigma}\right) - \frac{1}{2\sigma^2}\sum_{i=1}^{N}\varepsilon_i^2,
\]
where \( y \) is the vector of log outputs, \( \varepsilon_i = \ln y_i - x_i'\beta = v_i - u_i \) is the composed residual, \( \Phi(\cdot) \) is the cumulative distribution function of the standard normal \( N(0,1) \) distribution, \( \sigma^2 = \sigma_v^2 + \sigma_u^2 \), and \( \lambda = \sigma_u/\sigma_v \). An equivalent parameterization uses \( \gamma = \sigma_u^2/\sigma^2 \): if \( \gamma = 0 \), deviations from the frontier are entirely due to noise (firms are fully efficient), while if \( \gamma = 1 \), all deviations are attributed to technical inefficiency.
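As a concrete illustration, the half-normal log-likelihood above can be maximized numerically. The sketch below uses simulated data and arbitrary parameter names; it is a minimal example under these assumptions, not the estimation code used in this thesis:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Simulated data for a one-input log-linear frontier (illustrative only).
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
u = np.abs(rng.normal(scale=0.4, size=n))    # half-normal inefficiency
v = rng.normal(scale=0.2, size=n)            # symmetric noise
y = 1.0 + 0.6 * x + v - u                    # observed log output

def neg_loglik(theta):
    b0, b1, ln_sv, ln_su = theta
    sv, su = np.exp(ln_sv), np.exp(ln_su)     # enforce positive standard deviations
    sigma = np.sqrt(sv**2 + su**2)
    lam = su / sv
    eps = y - b0 - b1 * x
    ll = (np.log(np.sqrt(2.0 / np.pi)) - np.log(sigma)
          + norm.logcdf(-eps * lam / sigma)
          - eps**2 / (2.0 * sigma**2))
    return -ll.sum()

start = np.array([0.0, 0.0, np.log(0.3), np.log(0.3)])
res = minimize(neg_loglik, start, method="BFGS")
b0, b1, ln_sv, ln_su = res.x
print(b0, b1, np.exp(ln_sv), np.exp(ln_su))   # frontier coefficients and std. deviations
```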
The log-likelihood function is maximized using an iterative optimization procedure (Judge et al., 1982, as cited in Coelli et al., 2005). The log-likelihood function for the exponential distribution of \( u \) takes an analogous form.
The case of the truncated normal distribution, i.e. \( u \sim N^+(\mu, \sigma_u^2) \), likewise has a corresponding log-likelihood function of similar form.
The log-likelihood function for the stochastic frontier model with a gamma-distributed \( u \) can be found in Greene (1990), where \( \Theta \) and \( P \) are the two parameters of the gamma distribution.
Figure 2-3: Distribution of technical inefficiency
Figure 2-3 displays the probability density functions of the four distribution types and highlights a limitation of the gamma, exponential, and half-normal distributions: they imply that most observations cluster around low values of u, that is, high efficiency and low technical inefficiency. This assumption may not hold in industries where such high efficiency is unrealistic. The truncated normal distribution, by contrast, offers greater flexibility because its mode need not lie at zero, making it a more general representation of u.
Taking a Cobb-Douglas production frontier as an example, \( \ln Y_i = \beta_0 + \beta_1 \ln L_i + \beta_2 \ln K_i + v_i - u_i \), the technical efficiency of a firm is the ratio of its observed output \( Y_i \) to the maximum feasible output \( Y_i^* \), that is, the output it would produce if it operated at full efficiency (\( u_i = 0 \)).
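Under a log-linear frontier such as the one above, this ratio reduces to a simple expression in \( u_i \) (a standard derivation, with \( Y_i^* \) defined as the output when \( u_i = 0 \)):
\[
TE_i = \frac{Y_i}{Y_i^*}
     = \frac{\exp(\beta_0 + \beta_1 \ln L_i + \beta_2 \ln K_i + v_i - u_i)}
            {\exp(\beta_0 + \beta_1 \ln L_i + \beta_2 \ln K_i + v_i)}
     = \exp(-u_i).
\]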
With \( u_{it} \) following the \( N^+(\mu_{it}, \sigma^2) \) distribution, as in Battese and Coelli (1995), technical inefficiency (or, equivalently, efficiency) can be analyzed in terms of a set of determinants through
\[
u_{it} = \delta_0 + z_{it}\delta + W_{it}, \qquad (2.3.7)
\]
where \( z_{it} \) is the vector of determinants of \( u_{it} \), \( \delta \) is the vector of parameters to be estimated, and \( W_{it} \) follows the truncation of the normal distribution \( N(0, \sigma^2) \) such that \( u_{it} \) remains non-negative (Battese and Coelli, 1995). This is called the technical inefficiency effects model and can be estimated simultaneously with the stochastic frontier.
Stochastic frontier model with panel data
When the stochastic frontier model is applied to cross-sectional data, three key issues emerge, as noted by Schmidt and Sickles (1984). The first is the inconsistency of the estimated technical inefficiency. Many studies adopt the approach proposed by Jondrow et al. (1982) to assess the technical inefficiency of each firm in the sample, using the conditional expectation
\[
E(u_i \mid \varepsilon_i) = \mu_{*i} + \sigma_{*}\,\frac{\phi(\mu_{*i}/\sigma_{*})}{\Phi(\mu_{*i}/\sigma_{*})}, \qquad
\mu_{*i} = -\frac{\sigma_u^2\,\varepsilon_i}{\sigma^2}, \qquad
\sigma_{*}^2 = \frac{\sigma_u^2\sigma_v^2}{\sigma^2}, \qquad (2.4.1)
\]
where \( \phi \) and \( \Phi \) are the standard normal density and cumulative distribution function, respectively. Because \( \mu_{*} \) and \( \sigma_{*} \) are unknown, their estimators \( \hat{\mu}_{*} \) and \( \hat{\sigma}_{*} \) are used, which introduces a sampling bias; this bias vanishes asymptotically and can usually be ignored in large samples. However, adding more firms does not bring more information about any single firm's inefficiency, so the estimate of each firm's technical inefficiency level remains inconsistent (Schmidt and Sickles, 1984).
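A small sketch of how this conditional-mean estimator is typically evaluated at the estimated parameters, for the half-normal case (hypothetical inputs and function name, not the thesis's code):

```python
import numpy as np
from scipy.stats import norm

def jlms_te(eps, sigma_u, sigma_v):
    """Technical efficiency from composed residuals eps = v - u (half-normal case)."""
    sigma2 = sigma_u**2 + sigma_v**2
    mu_star = -eps * sigma_u**2 / sigma2                 # mean of u | eps before truncation
    sigma_star = sigma_u * sigma_v / np.sqrt(sigma2)
    z = mu_star / sigma_star
    u_hat = mu_star + sigma_star * norm.pdf(z) / norm.cdf(z)   # E[u | eps], eq. (2.4.1)
    return np.exp(-u_hat)                                 # TE = exp(-E[u | eps])

print(jlms_te(np.array([-0.3, 0.0, 0.2]), sigma_u=0.4, sigma_v=0.2).round(3))
```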
The second issue concerns the distributional assumption. To decompose the overall error term into technical inefficiency and statistical noise, a strong assumption about the distribution of each component and their independence is needed, and the robustness of this assumption is difficult to test with cross-sectional data. The third issue is the assumption that technical inefficiency is uncorrelated with the other regressors; when this fails, endogeneity arises and the estimates are biased. Schmidt and Sickles (1984) argue that such correlation is almost inevitable, since firms eventually learn their inefficiency levels and adjust their input usage to improve efficiency.
Panel data models, which use data on N firms observed over T periods, address these three weaknesses (Greene, 2008). First, having more observations over time improves the consistency of the estimated technical inefficiency as the time series grows. Second, by treating technical inefficiency as a fixed effect, the models can be made distribution-free, so the distributional assumption becomes optional. Third, the uncorrelatedness assumption can be relaxed, as certain panel models allow for correlation effects. The following sections review the development of panel data stochastic frontier models since their inception.
4.1 Time-invariant models
a. Within estimation with fixed effects and GLS estimation with random effects from Schmidt and Sickles (1984)
Schmidt and Sickles (1984) advocate the use of panel data to estimate technical inefficiency that remains constant over time, using both fixed and random effects. Their model is \( \ln Y_{it} = \alpha + \beta'X_{it} + v_{it} - u_i \).
Schmidt and Sickles (1984) employ a log-linear function and a within estimator with dummy variables to obtain a distinct intercept for each firm, reflecting its individual technical inefficiency. This approach has the advantage of requiring no assumptions about the correlation between inefficiency and the regressors or about the distribution of inefficiency. Each firm is compared against the most efficient firm in the sample, so inefficiency is measured as the difference between the maximum estimated intercept and the firm's own estimated intercept. The authors note that a large number of firms is needed to estimate the most efficient firm precisely, ideally over long time periods. A drawback is that, as a fixed-effects estimation on panel data, the method captures time-invariant but firm-specific factors, such as capital stock, that may shift the firm's intercept but should not be classified as inefficiency.
To address the limitations of the within estimator, the authors propose assuming that inefficiency is uncorrelated with the regressors, which allows Generalized Least Squares (GLS) estimation under random effects to provide improved estimates. This approach can separate the effects of time-invariant regressors, something the within estimator cannot do. However, it relies on a stringent assumption that must be considered carefully.
The authors propose two further alternatives to deal with the uncorrelatedness and distribution assumptions. The first is the Hausman and Taylor (1981) estimator, which relaxes the uncorrelatedness assumption. The second is maximum likelihood estimation, which is more sophisticated and relies on a specific distributional assumption.
The two models discussed above are among the simplest approaches to measuring technical efficiency. They respond to the criticism that technical inefficiency cannot be estimated consistently by advocating fixed- and random-effects models, which provide more reliable estimates when the time dimension (T) is large relative to the number of firms (N). However, the absence of a distributional assumption makes it difficult to separate true inefficiency from firm-specific factors. Pitt and Lee (1981) introduced a half-normal distribution with maximum likelihood estimation, which is elaborated in the following section on a model that assumes time-invariant inefficiency.
Pitt and Lee (1981) analyze panel data from the Indonesian weaving industry to assess the level of technical inefficiency and its sources. They test the hypothesis of technical inefficiency being time-invariant or time-varying through three distinct models. In the first case, inefficiency is constant over time but varies among individuals, represented by a firm-specific, time-invariant term \( u_i \).
*Note: Pitt and Lee (1981) use a linear function
In the second case, technical inefficiency is independent over time and across individuals, which leads back to the cross-sectional model of Aigner et al. (1977). That is, \( E(u_{it}u_{it'}) = 0 \) and \( E(u_{it}u_{jt}) = 0 \) for all \( t \neq t' \) and \( i \neq j \).
The final case is intermediate between these two: technical inefficiency is assumed to be correlated over time. That is, \( E(u_{it}u_{it'}) \neq 0 \) and \( E(u_{it}u_{jt}) = 0 \) for all \( t \neq t' \) and \( i \neq j \).
The first two models are estimated using the maximum likelihood method, while the intermediate model employs generalized least squares because the maximum likelihood procedure is intractable in that case. The first two models are compared against the third using a chi-squared test, as detailed in Joreskog and Goldberger (1972).
The test indicates that the third model is the suitable one, suggesting that technical inefficiency varies over time. Although the paper does not report a measure of technical inefficiency for each firm, it can be calculated using the method proposed by Jondrow et al. (1982), which deduces each firm's inefficiency from its residual.
Although the time-varying model is more precise, it does not specify the distribution of technical inefficiency and provides no measure of it. Pitt and Lee (1981) therefore propose a model featuring time-invariant inefficiency with a half-normal distribution, while advocating further work on time-varying inefficiency. The half-normal assumption, however, can be considered unreasonable in certain contexts.
Battese and Coelli (1988) introduce a more general distributional framework by proposing the truncated normal distribution. This model is examined in the following section.
Battese and Coelli (1988) propose a model in which technical inefficiency follows a truncated normal distribution, building on Stevenson's (1980) framework for estimating stochastic production frontiers. Because their data cover only three years, the authors assume that inefficiency remains constant over time. The new distribution, denoted \( N^+(\mu, \sigma^2) \), is more general than the previous one (the half normal used in Pitt and Lee (1981) and Schmidt and Sickles (1984)), because when \( \mu = 0 \) it reduces to the half normal. Using Stevenson's (1980) development in calculating the likelihood function, the model is estimated by maximum likelihood and can be described as:
\( \ln Y_{it} = \alpha + \beta' \ln X_{it} + v_{it} - u_i \) (2.4.6)
*Note: Battese and Coelli (1988) use a Cobb-Douglas function
METHODOLOGY
Overview of Vietnamese metal manufacturing industry
This study categorizes firms into two groups based on their main products: basic metal manufacturing and fabricated metal manufacturing (excluding machinery and equipment). According to the International Standard Industrial Classification (ISIC) Revision 4, basic metal manufacturing firms, which represent only 9% of the sample, engage in smelting and refining ferrous and non-ferrous metals, activities that require significant investments in physical assets. Fabricated metal manufacturing firms, accounting for approximately 91% of the sample, produce more widely used items such as structural metal products, metal containers, and steam generators.
The metal manufacturing industry in Vietnam holds significant potential because of the high demand for metal products in daily use, production, and construction. Despite its promising prospects, the industry remains underdeveloped, with approximately 80% of iron and steel materials used for construction purposes, according to the World Steel Association. The growing domestic demand for metal materials in machinery, motors, automobiles, and consumer goods further supports the industry's growth. However, the sector also faces challenges: a downturn in the Vietnamese economy and reduced investment in public construction due to government budget constraints led to a nine percent decline in steel consumption in 2012, as reported by the Vietnamese Steel Association, and rising input costs for electricity, water, and labor continue to strain operations.
This study analyzes the technical efficiency of firms in this important industry. Notably, a significant portion of the sample consists of micro and household firms (74.5%), while medium-sized firms represent only 4%. The analysis is based on data collected between 2005 and 2009, when economic conditions differed markedly from today's, so the conclusions drawn about the industry may be affected by sample bias. Readers should therefore interpret the findings with caution, particularly when considering their application for policy recommendations.
Analytical framework
The panel-data stochastic frontier models reviewed in Chapter II are used to assess the technical inefficiency levels of Vietnamese SMEs in the metal manufacturing sector. The production function is estimated with both Cobb-Douglas and Translog functional forms. In the technical inefficiency effects model of Battese and Coelli (1995), a set of firm-specific variables is incorporated to identify sources of technical inefficiency. The results are then compared across models to assess how different assumptions and model specifications influence the estimated technical efficiency.
Research method
Firm efficiency is assessed using the Stochastic Frontier Model, which requires selecting an appropriate functional form for the frontier. Various production functions can be considered, including the Linear, Cobb-Douglas, Quadratic, Normalized Quadratic, Translog, Generalized Leontief, and Constant Elasticity of Substitution (CES) forms. According to Coelli et al. (2005), a good functional form should be flexible, linear in parameters, regular, and parsimonious. The Linear and Cobb-Douglas forms are first-order flexible, while the others are second-order flexible, and most production functions are linear in parameters; both the Cobb-Douglas and Translog functions become linear in parameters after a logarithmic transformation. A regular functional form satisfies the economic properties of production functions, while a parsimonious function is the simplest form that is adequate for the problem. Researchers must therefore navigate the trade-off between flexibility and parsimony when selecting a functional form.
Researchers often prefer the Cobb-Douglas function for its simplicity and ease of use, while the Translog function is favored for its flexibility. The Cobb-Douglas form imposes constant production elasticities and a constant elasticity of factor substitution, whereas the Translog form allows them to vary, making it more realistic and less restrictive for testing properties of the production function. However, the Translog model's cross and squared terms increase the number of parameters, which can lead to correlation among them, and with a limited number of observations this increase reduces the degrees of freedom. A comparison of the Cobb-Douglas and Translog functions can be illustrated using four inputs: capital (K), labor (L), materials (M), and indirect costs (I).
Cobb-Douglas functional form:
\[
\ln Y_{it} = \beta_0 + \beta_1 \ln K_{it} + \beta_2 \ln L_{it} + \beta_3 \ln M_{it} + \beta_4 \ln I_{it} + v_{it} - u_{it}
\]
Translog functional form:
\[
\begin{aligned}
\ln Y_{it} = {} & \beta_0 + \beta_1 \ln K_{it} + \beta_2 \ln L_{it} + \beta_3 \ln M_{it} + \beta_4 \ln I_{it} \\
& + \tfrac{1}{2}\big[\beta_5 \ln K_{it}\ln L_{it} + \beta_6 \ln K_{it}\ln M_{it} + \beta_7 \ln K_{it}\ln I_{it} + \beta_8 \ln L_{it}\ln M_{it} \\
& \quad + \beta_9 \ln L_{it}\ln I_{it} + \beta_{10} \ln M_{it}\ln I_{it} + \beta_{11} (\ln K_{it})^2 + \beta_{12} (\ln L_{it})^2 \\
& \quad + \beta_{13} (\ln M_{it})^2 + \beta_{14} (\ln I_{it})^2\big] + v_{it} - u_{it}
\end{aligned}
\]
where the subscript i denotes firms and t denotes time periods; \( Y_{it} \) is output; \( K_{it} \) is capital; \( L_{it} \) is labor; \( M_{it} \) is materials; \( I_{it} \) is indirect costs; \( v_{it} \) is statistical noise, which follows \( N(0, \sigma_v^2) \); and \( u_{it} \) is technical inefficiency, which follows one of the non-negative distributions discussed above. Without the squared and interaction terms, the Translog function reduces to the Cobb-Douglas form.
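In practice, the Translog regressors are simply the squared and interaction terms built from the logged inputs (the 1/2 factor can be absorbed into the coefficients). A small sketch with hypothetical column names:

```python
import pandas as pd
from itertools import combinations

def add_translog_terms(df, inputs=("lnk", "lnl", "lnm", "lni")):
    """Build the squared and interaction regressors that turn a Cobb-Douglas
    specification into a Translog one (column names are illustrative)."""
    out = df.copy()
    for a, b in combinations(inputs, 2):      # six interaction terms for four inputs
        out[f"{a}_x_{b}"] = out[a] * out[b]
    for a in inputs:                          # four squared terms
        out[f"{a}_sq"] = out[a] ** 2
    return out

# tiny made-up example
df = pd.DataFrame({"lnk": [1.0, 1.2], "lnl": [0.5, 0.4],
                   "lnm": [2.0, 2.1], "lni": [0.3, 0.2]})
print(add_translog_terms(df).columns.tolist())
```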
The Cobb-Douglas functional form is characterized by constant proportionate returns to scale and a constant elasticity of factor substitution, and it assumes that all input pairs are complementary. These assumptions contribute to its restrictive nature.
A likelihood ratio (LR) test can be used to compare the goodness of fit of these two functional forms, with
\( H_0: \beta_5 = \beta_6 = \dots = \beta_{14} = 0 \) (the Cobb-Douglas form is adequate)
\( H_1: \) otherwise.
Where \( L(H_0) \) is the log-likelihood value of the null model (\( H_0 \)) and \( L(H_1) \) is the log-likelihood value of the alternative model (\( H_1 \)), the test statistic is
\[
LR = -2\,[L(H_0) - L(H_1)],
\]
which approximately follows a chi-squared distribution with degrees of freedom equal to the difference between the degrees of freedom of the null and alternative models.
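As an illustration, the test statistic and p-value can be computed directly from the two estimated log-likelihood values; the numbers below are made up, not results from this thesis:

```python
from scipy.stats import chi2

def lr_test(loglik_null, loglik_alt, df):
    """Likelihood-ratio test: H0 = restricted model (e.g. Cobb-Douglas),
    H1 = unrestricted model (e.g. Translog), df = number of restrictions."""
    lr = -2.0 * (loglik_null - loglik_alt)
    p_value = chi2.sf(lr, df)
    return lr, p_value

# hypothetical log-likelihoods; 10 extra Translog parameters
print(lr_test(-512.3, -498.7, df=10))
```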
This thesis employs two computer programs, STATA and FRONTIER 4.1, to estimate stochastic frontier models. Since the introduction of the model, STATA has significantly developed its commands for estimating stochastic frontier models: "frontier" for cross-sectional data and "xtfrontier" for panel data are the popular STATA commands for estimating technical efficiency.
The "frontier" command in STATA is capable of handling models that follow half normal, truncated normal, exponential, or gamma distributions Meanwhile, "xtfrontier" addresses both time-invariant and time-varying models as proposed by Battese and Coelli in their 1988 and 1992 studies, respectively However, users often encounter challenges when testing alternative models with just these two commands Fortunately, the recent work by Belotti, Daidone, Ilardi, and Atella (2012) has introduced the "sfcross" and "sfpanel" commands, significantly enhancing STATA's functionality for these models with simplified syntax Additionally, FRONTIER 4.1 by Coelli (1996) is equipped to manage models from Battese and Coelli's 1992 and 1995 research, providing further versatility for users.
With logarithmic functional forms such as the Cobb-Douglas and Translog, technical efficiency is computed in two distinct ways. The first, following Jondrow et al. (1982), uses \( TE_i = \exp\{-E[u_i \mid \varepsilon_i]\} \); the second, introduced by Battese and Coelli (1988), uses \( TE_i = E[\exp(-u_i) \mid \varepsilon_i] \).
The TE calculated from either formula is a positive value below one and represents the ratio of actual output to the maximum output achievable without any inefficiency.
It means TE is a comparison between the output of a real firm and the output of an efficient firm.
The time-invariant fixed- and random-effects models of Schmidt and Sickles (1984) are estimated in the same way as ordinary panel regressions with fixed and random effects, using least squares methods, namely the within estimator and Generalized Least Squares (GLS). After estimation, each firm's effect is compared with the highest value in the sample, and inefficiency is estimated as \( \hat{u}_i = \max_j(\hat{\alpha}_j) - \hat{\alpha}_i \). The model therefore assumes that the most efficient firm in the sample has a technical efficiency of 100% (\( u = 0 \)). The same approach applies to the other models estimated by least squares methods, such as Cornwell et al. (1990) and Lee and Schmidt (1993).
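A minimal sketch of this within-estimator procedure, assuming a long-format panel with hypothetical column names (an illustration of the idea, not the thesis's code):

```python
import numpy as np
import pandas as pd

def within_te(df, y="lny", xs=("lnk", "lnl", "lnm", "lni"), firm="firm"):
    """Schmidt and Sickles (1984) style fixed-effects efficiency: within OLS for the
    slopes, firm intercepts recovered afterwards, inefficiency measured against
    the best firm in the sample."""
    cols = list(xs)
    g = df.groupby(firm)
    y_dem = df[y] - g[y].transform("mean")
    X_dem = df[cols] - g[cols].transform("mean")
    beta, *_ = np.linalg.lstsq(X_dem.to_numpy(), y_dem.to_numpy(), rcond=None)
    alpha = (df[y] - df[cols].to_numpy() @ beta).groupby(df[firm]).mean()  # firm intercepts
    u_hat = alpha.max() - alpha             # distance to the best firm in the sample
    return np.exp(-u_hat)                   # technical efficiency per firm

# tiny made-up firm-year panel in logs
df = pd.DataFrame({
    "firm": [1, 1, 2, 2, 3, 3],
    "lny":  [1.9, 2.0, 1.5, 1.6, 2.2, 2.3],
    "lnk":  [1.0, 1.1, 1.0, 1.1, 1.0, 1.1],
    "lnl":  [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
    "lnm":  [0.8, 0.9, 0.8, 0.9, 0.8, 0.9],
    "lni":  [0.2, 0.2, 0.2, 0.2, 0.2, 0.2],
})
print(within_te(df).round(3))
```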
Models with fixed effects, such as those proposed by Cornwell et al. (1990) and Lee and Schmidt (1993), often include numerous parameters and can suffer biases from the incidental parameters problem. In particular, Cornwell et al.'s model specifies a flexible time function of technical inefficiency that cannot be applied to a dataset with only three time periods. In addition, the command developed by Belotti et al. (2012) for Lee and Schmidt's model measures technical efficiency against the highest level across all years rather than within each individual year, which makes the results difficult to interpret. Consequently, the technical efficiency levels derived from these two models are not included in our results.
The models developed by Pitt and Lee (1981) and Battese and Coelli (1988) are both estimated using the maximum likelihood method, but they differ in their distributional assumptions: Pitt and Lee assume a half-normal distribution, whereas Battese and Coelli use a truncated normal distribution.
The likelihood functions for these two models are detailed in their original papers. The second model is more general because it incorporates an additional parameter, \( \mu \), the mean of the normal distribution from which \( u \) is truncated; the former is essentially a special case of the latter when \( \mu = 0 \).
The models of Kumbhakar (1990) and Battese and Coelli (1992) are similar in spirit: both are estimated by maximum likelihood and both treat technical inefficiency as a firm effect scaled by a deterministic function of time. Kumbhakar (1990) specifies \( u_{it} = g(t)\,u_i \) with \( g(t) = [1 + \exp(bt + ct^2)]^{-1} \), where \( u_i \) is constant over time but varies across firms and follows a half-normal distribution. Battese and Coelli (1992) use \( u_{it} = \eta_{it}\,u_i \) with \( \eta_{it} = \exp[-\eta(t - T)] \) and \( u_i \sim |N(\mu, \sigma_u^2)| \), a truncated normal distribution at zero. These functional forms let the data decide the time behavior of \( u_{it} \); the Battese and Coelli (1992) form is simpler to compute, while the Kumbhakar (1990) form is more flexible in capturing the dynamics of technical inefficiency.
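To see how the two specifications behave, the scaling functions can be evaluated over a short panel; the parameter values below are purely illustrative assumptions:

```python
import numpy as np

t = np.arange(1, 11)
T = t[-1]
g_kumbhakar = 1.0 / (1.0 + np.exp(-0.4 * t + 0.02 * t**2))  # Kumbhakar (1990): [1 + exp(bt + ct^2)]^-1
g_bc92 = np.exp(-0.1 * (t - T))                              # Battese & Coelli (1992): exp[-eta(t - T)]
print(np.round(g_kumbhakar, 3))
print(np.round(g_bc92, 3))
```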
The technical inefficiency effects model proposed by Battese and Coelli (1995) incorporates variables such as age, size, ownership type, and firm location, capturing the influence of a firm's characteristics on technical inefficiency while also allowing for the effect of time. The stochastic frontier model and the technical inefficiency effects model are estimated simultaneously, which avoids the biases of two-step estimation discussed by Wang and Schmidt (2002). Greene (2005) introduces the "true" random- and fixed-effects models, using an exponential distribution for technical inefficiency. The author advocates a "brute force" approach in which the maximum likelihood method estimates all parameters, including the firm-specific constant terms, simultaneously in the "true" fixed-effects model; the likelihood functions are detailed in the original papers.
In Greene (2005), the "true" fixed- and random-effects models focus on technical inefficiency and treat all firm heterogeneity as unobservable. This study adopts a more comprehensive treatment and writes the two models in a general form. The "true" fixed-effects model is \( \ln Y_{it} = \beta' \ln X_{it} + u_i + v_{it} - e_{it} \), and the "true" random-effects model is \( \ln Y_{it} = (\alpha + w_i) + \beta' \ln X_{it} + v_{it} - e_{it} \). This framework distinguishes three components: technical inefficiency (\( e_{it} \)), observable heterogeneity (\( X_{it} \)), and unobservable heterogeneity (\( u_i \) in the fixed-effects model and \( w_i \) in the random-effects model).
This section gives a brief description of the variables in the Stochastic Frontier Model and the Technical Inefficiency Effects Model mentioned above.
3.2.1. Variables in the Stochastic Frontier Model