Multivariate analysis to identify groups of biodiesels in the operation of an agricultural engine

The increase in the use of diesel engines in the transport sector and in agriculture has led researchers to study alternative fuel sources. The objective was to identify the clusters existing in a set of biodiesels through multivariate exploratory analysis and to determine the variables that most influenced the division of the clusters, in order to assist in understanding how engines operate with these biodiesels. The work was carried out in two tests: test I, a tractor performance test, and test II, an exhaust smoke opacity test. The tractors used in the experiment were the traction tractor (tractor 1), a Valtra BM100, 4X2 with auxiliary front-wheel drive, 74 kW (100 hp) at 2,300 rpm engine speed, and the braking tractor (tractor 2), a Valtra BH140, 4X2 with auxiliary front-wheel drive, 103 kW (140 hp) at 2,400 rpm engine speed. Hierarchical and non-hierarchical cluster analysis and principal component analysis allowed the biodiesels to be sorted into three groups. The variables that most influenced the division of the groups were volumetric hourly consumption, hourly weight consumption, and specific consumption. The biodiesels in group one have a higher calorific value, resulting in lower biodiesel consumption in agricultural operations.


Introduction
The increase in agricultural mechanization has made agricultural practices in the field more efficient and feasible, and the tractor is one of the main machines used in these activities (Martins et al., 2018). However, the increased use of agricultural machinery has resulted in higher demand for energy sources such as diesel, a petroleum-derived fuel and therefore a non-renewable energy source (Araújo et al., 2014).
In an attempt to meet the demand for fuels, along with growing ecological awareness, researchers (Cunha et al., 2015; Guimarães et al., 2018) have been studying the use of biofuels that cause less damage to the environment and meet the needs of machines.
Among biofuels, biodiesel stands out for having physicochemical characteristics similar to diesel, in addition to minimizing the emission of pollutants during combustion and being used without mechanical alteration in diesel cycle engines (Guimarães et al., 2018). Biodiesel is considered a renewable fuel because the plants that are used as feedstock consume carbon, thus offsetting the carbon emitted during combustion.
In assessing the feasibility of biodiesel, researchers study the operation of machines with this biofuel produced from various raw materials (Lima et al., 2015), as well as the amount of particulate material emitted in the smoke, measured by means of opacity, which can vary depending on the type of material used in the production of the biodiesel. Neves et al. (2018), when evaluating the performance of a tractor with biodiesels produced from soy and murumuru (Astrocaryum murumuru Mart.), observed that the content of particulate material emitted in the smoke of murumuru biodiesel was significantly lower than that of soy biodiesel.
To evaluate the operation of machines with biodiesel, a set of variables is needed to help understand how a particular machine works with a particular biodiesel. However, this becomes complex when a set of biodiesels from different raw materials is to be evaluated. In this sense, to reduce this complexity, multivariate analysis can be used as an exploratory statistical method to simultaneously analyze all measurements in the experimental unit (Noronha et al., 2018).
In multivariate data analysis, cluster analysis (hierarchical and non-hierarchical) is widely used to extract the statistical properties of a data set by grouping similar vectors into classes. Another widely used analysis is principal component analysis, which allows for the simplification of the description of a set of interrelated variables (Noronha et al., 2018).
In view of the above, the aim was to identify the clusters existing in a set of biodiesels through multivariate exploratory analysis and determine the variables that most influenced the division of clusters to assist in understanding the operation of engines with these biodiesels.

Materials and Methods
The study was carried out in the Biocombustível e Ensaio de Máquinas (BIOEM), at the Universidade Estadual Paulista, Campus Jaboticabal, SP, Brazil. The geographical location of the test area is defined by the coordinates 21°15' S and 48°18' W, with an average altitude of 570 m.
The biodiesels used were produced in the Laboratório de Desenvolvimento de Tecnologias Limpas (LADETEL), of the University of São Paulo, Ribeirão Preto, SP, Brazil.
The present work was carried out in two trials: test I consisted of a tractor performance test, and test II, a static test, measured the opacity of the smoke from the exhaust pipe of an agricultural tractor as a function of the fuel used.
For both trials, the experimental design was completely randomized, with 26 treatments and 3 replications, totaling 78 observations. The treatments were 26 types of biodiesels (Table 1).
In test I, a braking tractor was coupled to the traction tractor by a steel cable, forming a train, to impose a controlled load of 20 kN on the drawbar of the test tractor.
The traction tractor (tractor 1) was a Valtra BM100, 4X2 with auxiliary front-wheel drive, 74 kW (100 hp) at 2,300 rpm engine speed, equipped with 14.9-24 tires on the front axle and 23.1-26 tires on the rear axle, and instrumented for the test according to Neves et al. (2018). The braking tractor (tractor 2) was a Valtra BH140, 4X2 with auxiliary front-wheel drive, 103 kW (140 hp) at 2,400 rpm engine speed, engaged in 3rd gear and used to provide 20 kN of resistance to the traction tractor.
Fuel consumption was measured in each plot in units of volume (mL) by recording the total volume fed at the injection pump inlet and the total volume returned; the fuel consumed was the difference between the two measurements. The fuel consumption measurement system consists of two assemblies, one on the feed line to the injection pump and the other on the return line. Each assembly contains an Oval Corporation flow meter, model Flowmate LSN 48, with 1% accuracy over the nominal flow rate and a maximum flow rate of 100 L h⁻¹, and a PT 100 resistive temperature sensor (100 ohms resistance at 0 °C) with a measurement range of -200 to 800 °C.
The drawbar force was obtained using an M. Shimitsu load cell, model TF 400, with a nominal range of 0 to 100 kN, coupled to the drawbar, with the force values recorded in kN. The traction force and fuel consumption data were monitored and stored in a Campbell Scientific data acquisition system, Micrologger CR23X model, and later transferred to a computer.
The actual travel speed was measured directly by a radar unit, model RVS II. From these data, drawbar power and hourly volumetric, hourly weight, and specific fuel consumption were determined.
The drawbar power was determined indirectly according to Equation 1.
Volumetric hourly consumption was measured based on the volume consumed and the travel time in each plot, according to Equation 2.
To calculate the hourly weight consumption, the influence of the density of the feed and return fuel at the time of the test was considered, according to Equation 3.
Specific consumption is the fuel consumption expressed in unit mass per unit power required at the drawbar, as per Equation 4.
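The four quantities described above can be sketched in code as follows. This is only an illustration under stated assumptions, not the authors' Equations 1 to 4: the function names, the unit choices (force in kN, speed in km h⁻¹, volume in mL, time in s, density in g L⁻¹) and the conversion constants are assumptions, and the example readings are hypothetical.

```python
def drawbar_power_kw(force_kn, speed_km_h):
    """Drawbar power in kW: force (kN) times speed (km/h converted to m/s)."""
    return force_kn * speed_km_h / 3.6


def volumetric_hourly_consumption_l_h(feed_ml, return_ml, time_s):
    """Volume consumed (feed minus return, mL) over the plot time, in L/h."""
    return (feed_ml - return_ml) / time_s * 3.6  # 1 mL/s = 3.6 L/h


def hourly_weight_consumption_kg_h(feed_ml, return_ml,
                                   feed_density_g_l, return_density_g_l, time_s):
    """Mass consumed per hour (kg/h), using separate feed and return densities."""
    mass_g = (feed_ml * feed_density_g_l - return_ml * return_density_g_l) / 1000.0
    return mass_g / time_s * 3.6  # 1 g/s = 3.6 kg/h


def specific_consumption_g_kwh(weight_consumption_kg_h, power_kw):
    """Fuel mass per unit of drawbar energy, in g/kWh."""
    return weight_consumption_kg_h * 1000.0 / power_kw


# Hypothetical plot readings, for illustration only.
power = drawbar_power_kw(20.0, 5.4)
vhc = volumetric_hourly_consumption_l_h(500.0, 350.0, 60.0)
hwc = hourly_weight_consumption_kg_h(500.0, 350.0, 880.0, 870.0, 60.0)
sc = specific_consumption_g_kwh(hwc, power)
```

Keeping the feed and return densities separate reflects the statement that fuel density at the time of the test was considered for both lines.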
For test II, which measured the opacity of the smoke produced by burning fuel in the tractor engine, a Tecnomotor light absorption opacimeter, model TM 133, was used; it is compatible with the NBR 13037, Inmetro and CEE 72/306 standards.

The opacimeter was connected to a Tecnomotor serial controller, model TM 616, which received the sensor signals and converted them into measurement units. This equipment exports the converted data to a computer through a serial connection. The interface is the IGOR software, which manages tests performed according to the free acceleration method, a regime in which the engine is subjected to maximum fuel flow and the power developed is absorbed only by the inertia of the engine's mechanical components (clutch and gearbox input shaft), since the vehicle is stationary. Opacity is expressed as K, the light absorption coefficient, in m⁻¹ (Tecnomotor, 2012).
Smoke opacity was determined on a Valtra test tractor, model BM100. At the end of each determination, the feeding system was completely drained to avoid contamination of the next test. In addition, after the fuel was changed, the engine was run for ten minutes before the start of each test.
For the analysis, the data were standardized (zero mean and unit variance) before the multivariate analysis was performed. In the exploratory statistics of the data, hierarchical clustering (dendrogram) and non-hierarchical clustering by the K-means method were performed, with k set to the number of groups indicated by the dendrogram. The analyses used Ward's method as the clustering strategy and Euclidean distance as the similarity coefficient. After the groups were identified, Hotelling's multivariate T² test was used to check the significance (p < 0.05) of the differences between them.
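The standardization and Ward clustering steps can be sketched as follows. This is a minimal illustration and not the Statistica procedure used in the study: the function names are hypothetical, the sample standard deviation (n - 1) is assumed for standardization, and the six two-variable observations are invented. The Ward criterion is written as the increase in within-cluster sum of squares caused by a merge, which depends on the squared Euclidean distance between cluster centroids.

```python
from math import sqrt


def standardize(rows):
    """Scale each column to zero mean and unit sample variance (n - 1)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [sqrt(sum((v - m) ** 2 for v in c) / (n - 1)) for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def centroid(points):
    """Column-wise mean of a list of points."""
    return [sum(c) / len(points) for c in zip(*points)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * d2


def ward_clusters(points, k):
    """Repeatedly merge the cheapest pair (Ward criterion) until k clusters remain."""
    clusters = [[i] for i in range(len(points))]  # each cluster holds row indices
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost([points[m] for m in clusters[i]],
                                 [points[m] for m in clusters[j]])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical observations: (hourly consumption, specific consumption).
rows = [[8.0, 250.0], [8.2, 255.0], [9.5, 300.0],
        [9.7, 305.0], [11.0, 360.0], [11.2, 365.0]]
groups = ward_clusters(standardize(rows), 3)
```

Standardizing first keeps variables measured on large scales from dominating the Euclidean distances.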
Subsequently, principal component analysis was performed, with the objective of visualizing the groups of biodiesels in the two-dimensional plane formed by the principal components and interpreting the discriminatory power of the variables in each principal component. All analyses and graphs were produced in Statistica software, version 7.0 (Statsoft, 2020).

Results and Discussion
The cluster analysis allowed the 26 types of biodiesels to be sorted into three groups (Figure 1): Group 1 (5 biodiesels, with a similarity index of 2), Group 2 (7 biodiesels, with a similarity index of 6) and Group 3 (14 biodiesels, with a similarity index of 3). Hotelling's T² test showed significant differences (p < 0.001) between the groups. According to Noronha et al. (2018), cluster analysis makes it possible to extract the statistical properties of the data set by grouping similar vectors into classes.
Figure 2 shows the results of the non-hierarchical cluster analysis (k-means), which started from the division of the biodiesels into three groups obtained in the hierarchical analysis (dendrogram). The partitioning criterion in this analysis is to maximize the inter-group variance and minimize the intra-group variance.
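The partitioning criterion can be illustrated with a small k-means sketch. It is an assumption-laden example rather than the software routine used in the study: the function names, the starting centers and the six points are hypothetical. The sketch also computes the within-group and between-group sums of squares, whose total equals the total sum of squares when each center is its group mean.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Column-wise mean of a list of points."""
    return [sum(c) / len(points) for c in zip(*points)]


def k_means(points, start_centers, max_iter=100):
    """Lloyd's algorithm from given starting centers; returns (labels, centers)."""
    centers = [list(c) for c in start_centers]  # copy so the input is not modified
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centers are already group means
        labels = new_labels
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous center if a group becomes empty
                centers[c] = mean_point(members)
    return labels, centers


def sums_of_squares(points, labels, centers):
    """Within-group and between-group sums of squares for a partition."""
    grand = mean_point(points)
    within = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    between = sum(labels.count(c) * sq_dist(centers[c], grand)
                  for c in range(len(centers)))
    return within, between


# Hypothetical standardized observations forming three tight pairs.
points = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0], [10.0, 0.0], [10.0, 1.0]]
labels, centers = k_means(points, [points[0], points[2], points[4]])
within, between = sums_of_squares(points, labels, centers)
```

A good partition has a small within-group sum of squares relative to the between-group sum of squares, which is the same contrast that an F value per variable summarizes.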
Table 2 shows that drawbar power had the highest significant F value, followed by volumetric hourly consumption, hourly weight consumption and travel speed, indicating that these variables are the main criteria for assigning objects (biodiesels) to the groups (Cardozo et al., 2014).
Table 2. Analysis of variance for the variables that characterized the groups of biodiesels formed by non-hierarchical clustering.

Figure 2 shows that the three groups follow distinct patterns in the variables. Group one is formed by biodiesels from soy and murumuru, and is best characterized by low volumetric hourly consumption (VHC), hourly weight consumption (HWC) and smoke opacity (SO), and by high travel speed (TS).
Group two is similar to group three in volumetric and weight hourly consumption, but is characterized by low specific consumption (SC) and high values of the other variables, whereas group three has high SC and low values of travel speed (TS) and drawbar power (DP). Neves et al. (2013) associate high volumetric hourly consumption with a low calorific value of the biodiesel. In this sense, the high VHC values in groups two and three may be related to the calorific value of the biodiesels in these two groups, indicating that the biodiesels in group one have a higher calorific value than the others.
The principal component analysis of the variables related to fuel consumption, travel speed and drawbar power obtained for the 26 types of biodiesel used in the study explained 81.25% of the variance contained in the original data set (Table 3).
Principal component 1 (PC1) explains 50.90% of the variance (eigenvalue of 3.05), and the variables with the highest discriminatory power in PC1 were VHC (0.62), SC (0.88), TS (-0.91) and DP (-0.83). In PCA, variables whose loadings have the same sign vary directly: as the value of one increases, so does the value of the other. Variables with opposite signs vary inversely: as one increases, the other decreases. Thus, volumetric hourly consumption and specific consumption vary directly with each other and inversely with travel speed and drawbar power. Principal component 2 (PC2) explains 30.35% of the variance (eigenvalue of 1.82), with VHC (0.72), HWC (0.75) and SO (0.62) being the variables with the greatest discriminating power, all varying directly.
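The link between the eigenvalues and the explained variance, and the reading of the loading signs, can be checked with a short sketch. It assumes that the PCA was run on six standardized variables (volumetric hourly consumption, hourly weight consumption, specific consumption, travel speed, drawbar power and smoke opacity), so that the total variance equals 6. That count is inferred from the variables named in the text and is not stated explicitly. The function names are hypothetical, and small differences from the reported percentages are expected because the eigenvalues are rounded to two decimals.

```python
# Values reported in the text for the first two principal components.
eigenvalues = {"PC1": 3.05, "PC2": 1.82}
N_VARIABLES = 6  # assumption: six standardized variables, each with unit variance

# PC1 loadings reported in the text, keyed by full variable name.
pc1_loadings = {
    "volumetric_hourly_consumption": 0.62,
    "specific_consumption": 0.88,
    "travel_speed": -0.91,
    "drawbar_power": -0.83,
}


def explained_share(eigenvalue, n_variables):
    """Percentage of total variance explained by one component."""
    return 100.0 * eigenvalue / n_variables


def split_by_sign(loadings):
    """Group variables by the sign of their loading on a component."""
    positive = sorted(name for name, w in loadings.items() if w > 0)
    negative = sorted(name for name, w in loadings.items() if w < 0)
    return positive, negative


shares = {pc: explained_share(ev, N_VARIABLES) for pc, ev in eigenvalues.items()}
same_direction, opposite_direction = split_by_sign(pc1_loadings)
```

Under this assumption the shares come out at roughly 50.8% and 30.3%, close to the reported 50.90% and 30.35%. The sign split places the two consumption variables on one side of PC1 and travel speed and drawbar power on the other.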
In the biplot (Figure 3), constructed from the scores and the factor loadings of PC1 and PC2, one can see the discrimination of the biodiesels located on the right and left (PC1) and at the top and bottom (PC2) (Hair et al., 2005). According to Sousa et al. (2018), the influence of a vector in the analysis is measured by its length. In the present study, with the exception of smoke opacity, all variable vectors have the same length and consequently the same influence.
Three distinct groups are observed for the 26 types of biodiesels, similar to the hierarchical cluster analysis (Figure 3). Group one, formed by soy biodiesel, murumuru biodiesel, and blends of murumuru and soy containing more than 80% murumuru, is located on the left of PC1 and at the bottom of PC2. This group is little influenced by VHC and HWC, as it has low consumption values relative to the other groups, while presenting average values of TS, DP, SO and SC. This corroborates Iamaguti (2017), who, working with blends of murumuru and soy biodiesel in an agricultural tractor, observed that blends with a murumuru proportion above 80% reduce hourly volumetric and weight consumption.
Group two is made up of biodiesels from buriti, babassu and tucumã, and of soy and murumuru blends containing more than 30% soy. It is located on the left of PC1 and at the top of PC2, being strongly influenced by high values of TS, DP and SO and little influenced by SC, with low specific consumption relative to the other groups (Figure 2). Similar characteristics of buriti, babassu and tucumã biodiesels for the TS and DP variables were reported by Lira (2018), who, studying the performance of an agricultural tractor using palm biodiesels, found no difference among the three types.
Group three consists of biodiesels produced from peanut, sunflower, palm and castor oils and from residual oils from the UNESP university restaurant and cafeteria. These biodiesels are located on the right of PC1, divided between the top and bottom of PC2. This group is characterized by high values of volumetric, weight and specific consumption, with little influence from the TS, DP and SO variables. These results corroborate earlier work on the operational behavior of a tractor as a function of the type of biodiesel, in which castor and palm biodiesels showed similar volumetric, weight and specific consumption, as observed in the present work.

Conclusions
Hierarchical and non-hierarchical cluster analysis and principal component analysis made it possible to sort the biodiesels into three groups.
The variables that most influenced the division of the groups were volumetric hourly consumption, hourly weight consumption, and specific consumption.
Table 3. Correlation between the principal components and the variables studied.

The biodiesels in group one are characterized by a higher calorific value, resulting in lower consumption of biodiesel in agricultural operations.