Equivariant adjusted least squares estimator in two-line fitting model

We consider the two-line fitting problem. True points lie on two straight lines and are observed with Gaussian perturbations. For each observed point, it is not known on which line the corresponding true point lies. The parameters of the lines are estimated. This model is a restriction of the conic section fitting model because a pair of lines is a degenerate conic section. The following estimators are constructed: two projections of the adjusted least squares estimator in the conic section fitting model, the orthogonal regression estimator, the parametric maximum likelihood estimator in the Gaussian model, and the regular best asymptotically normal moment estimator. Conditions for the consistency and asymptotic normality of the projections of the adjusted least squares estimator are provided. All the estimators constructed in the paper are equivariant. The estimators are compared numerically.


Two-line fitting model
Consider the problem of estimating two lines from perturbed observations of points that lie on the lines. Let the true points $(\xi_i, \eta_i)$ lie on the union of two different lines $\eta = k_1 \xi + h_1$ and $\eta = k_2 \xi + h_2$, that is,
$$\eta_i = k_1 \xi_i + h_1 \quad \text{or} \quad \eta_i = k_2 \xi_i + h_2, \qquad i = 1, \ldots, n. \tag{1}$$
Let these points be observed with perturbations $(\delta_i, \varepsilon_i)$, $i = 1, \ldots, n$, that is, the observed points are $(x_i, y_i)$, $i = 1, \ldots, n$, with
$$x_i = \xi_i + \delta_i, \tag{2}$$
$$y_i = \eta_i + \varepsilon_i. \tag{3}$$
The perturbations are assumed to be independent and identically normally distributed,
$$(\delta_i, \varepsilon_i)^\top \sim N(0, \sigma^2 I), \tag{4}$$
where $I$ is the 2 × 2 identity matrix. The parameters $k_1$, $h_1$, $k_2$, $h_2$, and $\sigma^2$ are to be estimated.
We consider both functional and structural models. In the functional model, the true points are assumed to be nonrandom. In the structural model, the true points are assumed to be independent and identically distributed (i.i.d.). The errors $(\delta_i, \varepsilon_i)$ are i.i.d. and independent of the true points.
In the structural model, (ξ i , η i , δ i , ε i ) are i.i.d. random vectors, and thus, the observed points (x i , y i ) are i.i.d. In the functional model, the observed points are independent and Gaussian, with different means but with a common covariance matrix.
Remark 1. The true lines defined by Eqs. (1) cannot be parallel to the y-axis. In order to avoid overflows during evaluation of the estimators (except for the RBAN-moment estimator), another parameterization is used internally: τ ⊤ (ζ − ζ 0 ) = 0, where τ is a unit vector orthogonal to the line, and ζ 0 is a point on the line. The computation of the RBAN-moment estimator (see Section 2.4) is implemented for the explicit parameterization only. Computational optimization of the RBAN-moment estimator is a matter of further work.
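The relation between the explicit parameterization y = kx + h and the normal form τ ⊤ (ζ − ζ 0 ) = 0 can be sketched as follows. This is only an illustration: the function names and the choice ζ 0 = (0, h) are assumptions made here, not the implementation used for the simulations.

```python
import math

def explicit_to_normal(k, h):
    """Convert y = k*x + h to (tau, zeta0) with tau . (zeta - zeta0) = 0."""
    norm = math.hypot(k, 1.0)
    tau = (k / norm, -1.0 / norm)   # unit vector orthogonal to the line
    zeta0 = (0.0, h)                # one point on the line (assumed choice)
    return tau, zeta0

def normal_to_explicit(tau, zeta0, tol=1e-12):
    """Inverse conversion; returns None for a line parallel to the y-axis."""
    tx, ty = tau
    if abs(ty) < tol:
        return None                 # no finite slope exists
    k = -tx / ty
    h = zeta0[1] - k * zeta0[0]
    return k, h
```

The normal form stays well defined for a line parallel to the y-axis, whereas the explicit form has no finite slope there, which is why the second function may return None.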
The explicit parameterization has the advantage that the number of parameters is equal to the dimension of the parameter space. (In [12], the second-order equation (5) has six unknown coefficients, but the conic section can be parameterized with five parameters. The parameter space for the parameters of the conic section was the five-dimensional unit sphere in the six-dimensional Euclidean space. The mismatch between the number of parameters and the dimension of the parameter space made the asymptotic covariance matrix of the estimator singular.) In simulations, the confidence intervals for the coordinates of the intersection point of the two lines are obtained based on the asymptotic covariance matrix for the intersection point. For the projections of the ALS2 estimator, that asymptotic covariance matrix can be evaluated without use of the explicit line parameterization.
Hereafter, a second-order algebraic curve is called a "conic section" or a "conic." The points are observed with Gaussian perturbations, and the perturbed points are denoted as (x i , y i ). We have the same equations as (2)-(4) in the two-line fitting model. The vector of coefficients in (5) is denoted by β = (A, 2B, C, 2D, 2E, F ) ⊤ . The nonzero vector β and the error variance σ 2 are the parameters of interest.
Similarly to the two-line fitting model, the functional and the structural models are distinguished.
A couple of lines is a degenerate case of a conic section. Therefore, the conic section fitting model is an extension of the two-line fitting model.

ALS2 estimator in conic section fitting model
We consider the adjusted least squares (ALS) estimator for unknown σ 2 . The estimator is constructed in [7]. Introduce the 6 × 6 symmetric matrix Asterisks are typed instead of some entries above the diagonal of a symmetric matrix. The entries of the matrix ψ(x, y; v) are generalized Hermite polynomials in x and y. The matrix ψ(x, y; v) is constructed such that E ψ(x i , y i ; σ 2 ) = ψ(ξ i , η i ; 0) in the functional model and ψ(ξ i , η i ; 0)β = 0 for the true points and true parameters. Denote The estimatorσ 2 of the error variance σ 2 is obtained from the equation Equation (6) always has a unique nonnegative solution. If n ≥ 6, then the solution to (6) is positive almost surely.
The matrix $\Psi_n(\hat\sigma^2)$ is singular. Define the estimator $\hat\beta$ of the vector $\beta$ as a nonzero solution to the equation $\Psi_n(\hat\sigma^2)\,\hat\beta = 0$.
The strong consistency of the ALS2 estimator is proved in [7] and [11] under somewhat different conditions. The asymptotic normality is proved in [12] for the functional model and in [13] for the structural model. Two consistent estimators of the asymptotic covariance matrix are constructed in [13].
Denote the normalized version of the true parameter Normalize the estimator of β in such a way that β ⊤ Ψ ′ n (σ 2 ) β = −n and β ⊤ β ≥ 0. Therefore, denote Proposition 2. 1. Under the conditions of Proposition 1, the estimator β is a strongly consistent estimator of β tn = (−β ⊤ Ψ ′ ∞ β) −1/2 β, that is, β → β tn a.s. 2. In the functional model, for all integer p ≥ 0 and q ≥ 0 such that p + q ≤ 6, let the following limits exist and be finite: whereas in the structural model, let E ξ 6 1 < ∞ and E η 6 1 < ∞. In both models, let rank Ψ ∞ = 5. Then the estimatorθ = ( β ⊤ ,σ 2 ) ⊤ is asymptotically normal in the following sense: where 3. Under the conditions of part 2 of Proposition 2, the following estimator of the asymptotic covariance matrix is consistent: in probability.

Estimation methods
The methods of fitting an algebraic curve (or surface) to observed points can be classified as follows.
Algebraic distance methods, where the residuals in the equations for the algebraic curve are minimized. For example, the minimum point of the sum of squared residuals $\sum_{i=1}^n (A x_i^2 + 2B x_i y_i + C y_i^2 + 2D x_i + 2E y_i + F)^2$ (with some normalizing constraint in order to avoid A = B = . . . = F = 0) in the conic fitting problem and $\sum_{i=1}^n \bigl((k_1 x_i - y_i + h_1)(k_2 x_i - y_i + h_2)\bigr)^2$ in the two-line fitting problem is called the ordinary least squares (OLS) estimator.
The criterion function for the OLS estimator is simple enough and can be adjusted so that the resulting estimator is consistent (under some conditions). Such an estimator is called the adjusted least squares (ALS) estimator. The OLS and ALS estimators are method-of-moments estimators, meaning that the criterion functions for the estimators are polynomials whose coefficients are sample moments of coordinates of the observed points. Hence, the OLS and ALS estimators can be computed efficiently.
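As an illustration of why the OLS estimator is a method-of-moments estimator, the following sketch fits a conic by minimizing the sum of squared residuals under the constraint ‖β‖ = 1. The constraint and the function name are assumptions chosen for this example (the text only requires some normalizing constraint). The criterion depends on the data only through a 6 × 6 matrix of sample moments.

```python
import numpy as np

def ols_conic_fit(x, y):
    """OLS conic fit under the (assumed) normalization ||beta|| = 1.

    beta = (A, 2B, C, 2D, 2E, F), so the residual at (x, y) is u . beta
    with u = (x^2, x*y, y^2, x, y, 1).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    U = np.column_stack([x**2, x * y, y**2, x, y, np.ones_like(x)])
    M = U.T @ U / len(x)                  # 6 x 6 matrix of sample moments
    eigvals, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    # The minimizer of beta^T M beta on the unit sphere is the eigenvector
    # of the smallest eigenvalue.
    return eigvecs[:, 0], eigvals[0]
```

For noiseless points on the lines y = x and y = −x + 1, the smallest eigenvalue is zero up to rounding, and the fitted β is proportional to (−1, 0, 1, 1, −1, 0). With measurement errors, this estimator is in general not consistent, which is what the adjustment in the ALS estimator addresses.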
In order to obtain parameters of two lines, the observed points are fitted with a conic section, and then the parameters of the conic section are used to obtain the parameters of two lines. There are some papers where this idea is used.
The problem of estimating the fundamental matrix for two-camera view is considered in [6]. The fundamental matrix is a singular matrix whose left and right nullvectors are the coordinates of each camera in the coordinate system of the other camera. Initially, the ALS estimator of the fundamental matrix is evaluated. Then it is projected so that the estimated fundamental matrix is singular.
In [14] the problem of segmentation of a finite-dimensional vector space into linear subspaces is considered, and the generalized principal component analysis method is introduced. The sample is fitted with an algebraic cone (a set of points that satisfy a homogeneous algebraic equation) by the OLS method. Then subspaces are extracted from the algebraic cone with use of a small learning sample. An application of segmentation of a vector space into hyperplanes to searching for planes in a binocular image is given in [16].
In [15] an ellipsoid fitting problem with the constraint that the center of the ellipsoid lies on a given line is considered. The algebraic distance with embedded constraint is minimized. The analytical (behavioral) properties of the optimization problem are studied. We consider a conic section fitting problem but with a different constraint: the conic degenerates to a pair of straight lines.
Geometric distance methods, where distances between the estimated curve and each point are minimized. The sum of squares of those distances is minimized, and the orthogonal regression (OR) estimator is obtained.
A numerical algorithm for evaluation of the orthogonal regression estimator is presented in monograph [1].
The orthogonal regression is consistent in the single straight line fitting problem [3, Section 1.3.2(a)]. In nonlinear models, the estimator may be inconsistent. There is a one-step correction procedure in explicit and implicit models [5,9] with application in the ellipsoid fitting model [9]. However, in the two-line fitting model, the correction from [9] is unstable.
Probabilistic methods. They are used to obtain the maximum likelihood (ML) estimator and Bayes estimators.

Notation
Let {A n , n = 1, 2, . . .} be a sequence of random events. The random event A n is said to hold eventually if almost surely there exists n 0 such that A n occurs for all n ≥ n 0 . In other words, the random event A n holds eventually if and only if it does not occur only for finitely many n almost surely.
The estimator β̂ is called asymptotically normal if √n (β̂ − β true ) → N (0, Σ) in distribution, where the asymptotic covariance matrix Σ may be singular, and n is the sample size. This definition differs from the conventional one adopted in asymptotic theory because here only √n-asymptotic normality is considered. Let ζ ∼ N (µ, Σ) be a bivariate random vector. Then E = {z : (z − µ) ⊤ Σ −1 (z − µ) ≤ 1} is called the 40% ellipsoid of the normal distribution because P(ζ ∈ E) = 1 − e^{−1/2} ≈ 0.3935. This is the ellipsoid where the probability density function is at least e^{−1/2} ≈ 0.6065 of its maximum.
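The constants attached to the ellipsoid E can be checked with a short computation. The sketch below uses the fact that, for a nonsingular bivariate normal distribution, the quadratic form (ζ − µ) ⊤ Σ −1 (ζ − µ) has the chi-square distribution with 2 degrees of freedom, whose distribution function is 1 − exp(−c/2). The function names are introduced here for illustration only.

```python
import math

def ellipse_coverage(c):
    """P((zeta - mu)^T Sigma^{-1} (zeta - mu) <= c) for a bivariate normal."""
    return 1.0 - math.exp(-c / 2.0)

def boundary_density_ratio(c):
    """Ratio of the density on the boundary of the ellipse to its maximum."""
    return math.exp(-c / 2.0)

coverage = ellipse_coverage(1.0)        # about 0.3935
ratio = boundary_density_ratio(1.0)     # about 0.6065
```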

Outline
In Section 2, we construct five estimators for the parameters of the two-line fitting model. In Section 3, we propose two definitions of the equivariance of an estimator and state that all five estimators are equivariant. The estimators are compared numerically in Section 4. The proofs are given in Appendix A.

ALS2 estimator and its projections
The two-line fitting model is a restriction of the conic section fitting model. A pair of lines defined by the equation $(k_1 \xi - \eta + h_1)(k_2 \xi - \eta + h_2) = 0$ is a degenerate conic section $A\xi^2 + 2B\xi\eta + C\eta^2 + 2D\xi + 2E\eta + F = 0$ with coefficients
$$A = C k_1 k_2, \quad 2B = -C(k_1 + k_2), \quad 2D = C(k_1 h_2 + k_2 h_1), \quad 2E = -C(h_1 + h_2), \quad F = C h_1 h_2, \tag{12}$$
with the constraint $C \neq 0$. The conic section ALS2 estimator provides estimation of the error variance $\sigma^2$ and the coefficients $A, B, \ldots, F$.
Denote by $\nu(i) \in \{1, 2\}$ the indicator of the line which the true point $(\xi_i, \eta_i)$ belongs to. Equation (1) can be rewritten as $\eta_i = k_{\nu(i)} \xi_i + h_{\nu(i)}$. The indicator $\nu(i)$ is nonrandom in the functional model, and it is a random variable in the structural model.
Then the ALS2 estimators $\hat\beta$ and $\hat\sigma^2$ are strongly consistent in the sense of (7) and (8).
There are two cases where the structural model is not identifiable. If the common distribution of the true points is concentrated on a straight line and on a single point (presumably not on the line), that is, then there are many ways to fit the true points with two lines. If the common distribution of the true points is concentrated in four points, that is, then there are three ways to fit the true points with two lines (unless three of the four points lie on a straight line, which is a particular case of (13)).

Proposition 4.
In the structural model, assume that E |ξ 1 | 3 < ∞ and that nonidentifiability conditions (13) and (14) do not hold. Then the ALS2 estimator is strongly consistent in the sense of (7) and (8).
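In order to estimate the parameters $k_1$, $h_1$, $k_2$, and $h_2$, we can solve Eqs. (12). Ignoring the last equation $F = C h_1 h_2$, the solution is
$$k_{1,2} = \frac{-B \pm \sqrt{B^2 - AC}}{C}, \tag{15}$$
$$h_1 = \frac{2(D + k_1 E)}{C(k_2 - k_1)}, \tag{16}$$
$$h_2 = \frac{2(D + k_2 E)}{C(k_1 - k_2)}. \tag{17}$$
Substituting the elements of the ALS2 estimator $\hat\beta = (\hat A, 2\hat B, \hat C, 2\hat D, 2\hat E, \hat F)^\top$ into the right-hand sides of (15)-(17), we obtain an "ignore-F" estimator $(\hat k_1, \hat h_1, \hat k_2, \hat h_2)$, where, in particular,
$$\hat k_{1,2} = \frac{-\hat B \pm \sqrt{\hat B^2 - \hat A \hat C}}{\hat C}. \tag{18}$$
If the conic section estimated by the ALS2 estimator is a hyperbola, then the "ignore-F" estimate of the two lines comprises the asymptotes of the hyperbola.
Choose the sign ± in (18) such that $\hat k_1 < \hat k_2$. We need the notation for the function that expresses the line parameters $k_1$, $h_1$, $k_2$, $h_2$ in elements of $\beta$ and is defined by (15)-(17). With this notation, we can write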
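A minimal sketch of the "ignore-F" step is given below. It is derived here from the expansion of (k 1 ξ − η + h 1 )(k 2 ξ − η + h 2 ) = 0, which gives A = Ck 1 k 2 , 2B = −C(k 1 + k 2 ), 2D = C(k 1 h 2 + k 2 h 1 ), and 2E = −C(h 1 + h 2 ); the algebraic form of the formulas and the function name are assumptions of this example and may differ from the expressions used in the paper. The coefficient F is not used.

```python
import math

def ignore_f_lines(beta):
    """Line parameters (k1, h1, k2, h2) from beta = (A, 2B, C, 2D, 2E, F).

    Returns None when the two slopes are not real and distinct
    (for example, when the fitted conic is an ellipse) or when C = 0.
    """
    A, B2, C, D2, E2, _F = beta          # F is deliberately ignored
    B, D, E = B2 / 2.0, D2 / 2.0, E2 / 2.0
    disc = B * B - A * C
    if C == 0 or disc <= 0:
        return None
    root = math.sqrt(disc)
    # Slopes are the roots of C*k^2 + 2*B*k + A = 0, ordered so that k1 < k2.
    k1, k2 = sorted(((-B - root) / C, (-B + root) / C))
    # Intercepts solve h1 + h2 = -2E/C and k2*h1 + k1*h2 = 2D/C.
    h1 = 2.0 * (D + k1 * E) / (C * (k2 - k1))
    h2 = 2.0 * (D + k2 * E) / (C * (k1 - k2))
    return k1, h1, k2, h2
```

The result does not change when β is multiplied by a nonzero factor or when F is altered, so for a hyperbola the returned lines depend only on its asymptotes.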

Proposition 5.
In the functional model, assume the following: Then the "ignore-F " estimator of the parameters of two lines is strongly consistent, that is, k̂ j → k j , ĥ j → h j , j = 1, 2, as n → ∞ almost surely.

Proposition 7.
In the functional model, assume the following: • for j = 1, 2 and p = 0, 1, . . . , 6, the following limits exist and are finite: • for j = 1 and j = 2, the matrices  Then the "ignore-F " estimator (k 1 ,ĥ 1 ,k 2 ,ĥ 2 ) ⊤ is asymptotically normal, namely where Σβ is the asymptotic covariance matrix ofβ, and K is the 4 × 6 matrix of derivatives of the mapping at the true parameters β tn , that is, The matrix KΣβK ⊤ is nonsingular.
. , E are multiplied by a common factor. So it does not matter which normalization of β is used.

Orthogonal regression estimator
The sum of squared distances between each observed point and the closer of the two lines is equal to
$$Q(k_1, h_1, k_2, h_2) = \sum_{i=1}^n \min_{j=1,2} \frac{(y_i - k_j x_i - h_j)^2}{1 + k_j^2}. \tag{24}$$
The orthogonal regression estimator is a Borel-measurable function of the observations such that
$$(\hat k_1, \hat h_1, \hat k_2, \hat h_2) \in \arg\min_{k_1, h_1, k_2, h_2} Q(k_1, h_1, k_2, h_2).$$
In the functional model, the orthogonal regression estimator is the maximum likelihood estimator. However, because the dimension of the parameter space grows with the sample size, the orthogonal regression estimator may be inconsistent.
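The criterion can be evaluated directly from its verbal definition: the squared distance from a point (x, y) to the line y = kx + h equals (y − kx − h)² / (1 + k²), and each point contributes the smaller of its two squared distances. The following sketch assumes the explicit parameterization; the function name is chosen here for illustration.

```python
def orthogonal_criterion(points, k1, h1, k2, h2):
    """Sum over points of the squared distance to the closer of two lines."""
    total = 0.0
    for x, y in points:
        d1 = (y - k1 * x - h1) ** 2 / (1.0 + k1 * k1)
        d2 = (y - k2 * x - h2) ** 2 / (1.0 + k2 * k2)
        total += min(d1, d2)
    return total
```

Because of the pointwise minimum, the criterion is not differentiable everywhere and may have several local minima, so any minimization routine applied to it should be treated with care.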

Parametric maximum likelihood estimator
The estimator is constructed in the structural model, so it should be called the structural maximum likelihood estimator.
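If a Gaussian distribution of a random point $(\xi, \eta)$ is concentrated on a straight line $\eta = k\xi + h$, then it is a singular normal distribution:
$$\begin{pmatrix} \xi \\ \eta \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_\xi \\ k\mu_\xi + h \end{pmatrix},\ \sigma_\xi^2 \begin{pmatrix} 1 & k \\ k & k^2 \end{pmatrix} \right), \tag{25}$$
where $\mu_\xi$ and $\sigma_\xi^2$ are the expectation and variance of the random variable $\xi$. Note that the covariance matrix $\sigma_\xi^2 \begin{pmatrix} 1 & k \\ k & k^2 \end{pmatrix}$ is singular and positive semidefinite.
If the distribution of a random point $(\xi_i, \eta_i)$ is concentrated on two straight lines $\eta = k_1 \xi + h_1$ and $\eta = k_2 \xi + h_2$ and the distribution on each line is Gaussian, then, due to (25), the conditional distributions are
$$\bigl[(\xi_i, \eta_i)^\top \mid \nu(i) = j\bigr] \sim N(\mu_j, \Sigma_{0j})$$
for $j = 1, 2$. The matrices $\Sigma_{0j}$ are positive semidefinite and singular, that is, $\lambda_{\min}(\Sigma_{01}) = \lambda_{\min}(\Sigma_{02}) = 0$, and the points $\mu_j$ are the centers of the Gaussian distributions of the points on each line. The distribution of $(\xi_i, \eta_i)$ is a mixture of two singular normal distributions,
$$(\xi_i, \eta_i)^\top \sim p\,N(\mu_1, \Sigma_{01}) + (1 - p)\,N(\mu_2, \Sigma_{02}),$$
where $p = P(\nu(i) = 1) = P(\nu(1) = 1)$ is the probability that the point $(\xi_i, \eta_i)$ lies on the first line. The distribution of the observed points is also a mixture of two Gaussian distributions,
$$(x_i, y_i)^\top \sim p\,N(\mu_1, \Sigma_1) + (1 - p)\,N(\mu_2, \Sigma_2), \qquad \Sigma_j = \Sigma_{0j} + \sigma^2 I.$$
The likelihood function for the sample of points with a mixture of two normal distributions is
$$L = \prod_{i=1}^n \bigl( p\,\varphi(x_i, y_i; \mu_1, \Sigma_1) + (1 - p)\,\varphi(x_i, y_i; \mu_2, \Sigma_2) \bigr),$$
where $\varphi(\,\cdot\,; \mu, \Sigma)$ is the density of a bivariate normal distribution.
One method of evaluating the maximum likelihood estimator is as follows:
1. Find the point of conditional minimum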

Setσ
Here σ̂ jxx , σ̂ jxy , and σ̂ jyy are the entries of the matrix Σ j , and µ̂ jx and µ̂ jy are the elements of the vector µ j :
3. Find the estimates k̂ 1 , ĥ 1 , k̂ 2 , ĥ 2 from the equations The denominator σ̂ jxx − σ̂ 2 may be equal to 0 with some positive probability. Occurrence of this event means that the estimated figure is a straight line and a single point outside the line rather than two straight lines.
In order to make the statement of consistency easier, assume that k 1 < k 2 and choose the estimator such that k̂ 1 ≤ k̂ 2 .
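How a line can be recovered from one mixture component may be sketched as follows. The sketch assumes that the covariance matrix of a component of the observed points equals σ²_ξ [[1, k], [k, k²]] + σ² I and that its mean lies on the line, which follows from the singular normal distribution on a line perturbed by independent N(0, σ² I) errors. The function name and the returned None are choices made for this example, not the routine used in the paper.

```python
def line_from_component(mu, Sigma, sigma2):
    """Slope and intercept of a line from one Gaussian mixture component.

    mu     -- (mu_x, mu_y), the component mean (assumed to lie on the line)
    Sigma  -- ((s_xx, s_xy), (s_xy, s_yy)), the component covariance matrix
    sigma2 -- the error variance
    """
    s_xx, s_xy = Sigma[0]
    denom = s_xx - sigma2       # variance of xi along the line
    if denom <= 0:
        return None             # component degenerates; no line is defined
    k = s_xy / denom
    h = mu[1] - k * mu[0]
    return k, h
```

For example, k = 2, h = 1, σ²_ξ = 3, σ² = 0.5, and µ_ξ = 4 give the mean (4, 9) and the covariance matrix ((3.5, 6), (6, 12.5)), from which the function recovers k = 2 and h = 1.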

RBAN moment estimator
The regular best asymptotically normal (RBAN) estimators were developed by Chiang [4]. Our RBAN moment estimator differs from the original RBAN in that not only the observed points $(x_i, y_i)$ but also the monomials $x_i^p y_i^q$, $p + q \le 4$, are averaged. Introduce the 14-dimensional vectors whose elements are the monomials of coordinates of observed points:
$$m(x, y) = (x^4, x^3 y, x^2 y^2, x y^3, y^4, x^3, x^2 y, x y^2, y^3, x^2, xy, y^2, x, y)^\top.$$
Evaluate the average and sample covariance matrix of the vectors m i : Denote , where ξ, δ, and ε are independent random variables such that E ξ q = µ q and (δ, ε) ⊤ ∼ N (0, σ 2 I).
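The averaging of monomials can be sketched as follows. The ordering of the 14 monomials follows the definition of m(x, y); the function names and the use of the divisor n − 1 in the sample covariance matrix are assumptions of this example.

```python
import numpy as np

def monomials(x, y):
    """The 14 monomials of degree 1 to 4 in the order used for m(x, y)."""
    return np.array([x**4, x**3 * y, x**2 * y**2, x * y**3, y**4,
                     x**3, x**2 * y, x * y**2, y**3,
                     x**2, x * y, y**2, x, y], dtype=float)

def moment_summary(xs, ys):
    """Average and sample covariance matrix of the vectors m(x_i, y_i)."""
    M = np.array([monomials(x, y) for x, y in zip(xs, ys)])
    mean = M.mean(axis=0)
    cov = np.cov(M, rowvar=False)   # divisor n - 1 (an assumption here)
    return mean, cov
```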
Basically, the function f 1 is defined for all µ p , p = 1, . . . , 4, that comprise possible 4-tuples of moments of a random variable, that is, satisfy see [10]. However, since the elements of the vector-function f 1 are polynomials of its arguments, it can be extended to R 7 . Denote In the structural model, Consider the equation It is a system of 14 equations in 14 variables. If (33) has a solution, then the moment estimator can be defined as one of the solutions. However, (33) may have no solution.
In the rest of Section 2.4, µ This minimization problem is similar to that in Theorem 6 in [4]. The minimum can be evaluated explicitly, and this allows us to reduce the dimension of minimization problem. The reduction of dimension of the optimization problem was used, for example, in [8].
The routines evaluating the RBAN-moment estimator and the estimator for its covariance matrix are developed without rigid theoretical basis; see Section 4.3.

Two definitions of equivariance
The similarity transformation of $\mathbb{R}^2$ is
$$g(z) = K U z + \Delta z, \tag{34}$$
where $U$ is an orthogonal matrix, $K \neq 0$ is a scaling coefficient, and $\Delta z \in \mathbb{R}^2$ is an intercept.
Hereafter, we use vector notation: the observed points are denoted z i = (x i , y i ) ⊤ , and the true points are denoted ζ i = (ξ i , η i ) ⊤ .
The statistical structure is invariant with respect to a transformation $g$ if the change of the probability measure induced by the transformation of the sample can be obtained by some transformation $\tilde g$ of parameters, that is, if there exists a bijection $\tilde g : \Theta \to \Theta$ such that $\forall \theta \in \Theta : P_{g(Z)|\theta} = P_{Z|\tilde g(\theta)}$.
Here $P_{g(Z)|\theta}$ is the induced probability measure; it is sometimes denoted $P_{g(Z)|\theta} = P_{Z|\theta} \circ g^{-1}$.
The statistical structure is similarity invariant if it is invariant with respect to all similarity transformations of the form (34).
In order to become similarity invariant, the underlying statistical structure needs some extension. We assume that the true points lie on two lines, which may be parallel to the y-axis. The following restrictions do not ruin the invariance: • The true lines ℓ 1 and ℓ 2 intersect each other but do not coincide.
• The true points ζ 1 . . . , ζ n in the functional model or the set supp(P ζ ) where the true points are concentrated in the structural model can be covered with two lines uniquely. In the structural model, this means that the nonidentifiability conditions (13) and (14) do not hold.

These transformations areg
The estimator is called equivariant with respect to the transformation g if, when the data are transformed, the estimator follows the inducing transformation of parameters. The estimator ℓℓ(Z) for two lines and the estimatorσ 2 (Z) for error variance are equivariant with respect to similarity transformation g if The estimator is called similarity equivariant if it is equivariant with respect to any similarity transformation g.
In a fitting problem, an estimator for a "true figure" is called fitting equivariant with respect to a transformation g(z), g : R 2 → R 2 , if, when the sample is transformed, the estimated "true figure" follows the same transformation g. An estimator is called similarity fitting equivariant if it is fitting equivariant with respect to any similarity transformation.
In the two-line fitting problem, denote by ∪{ℓ 1 , ℓ 2 } = ℓ 1 ∪ ℓ 2 the union of a pair of two lines. An estimator ℓℓ(Z) is similarity fitting equivariant if and only if for any similarity transformation g(z), The similarity fitting equivariant estimator depends on geometry of the plane and does not depend on the Cartesian coordinate system used.
Because of (35), in the two-line fitting model, the estimator for two lines ℓℓ(Z) is similarity equivariant if and only if it is similarity fitting equivariant.
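What it means for a line to "follow" a similarity transformation can be illustrated in the normal form τ ⊤ (ζ − ζ 0 ) = 0. The sketch below assumes that the similarity transformation has the form g(z) = KUz + ∆z with an orthogonal matrix U and K ≠ 0; then the image of the line has the normal vector Uτ and passes through g(ζ 0 ). The helper names are hypothetical.

```python
import math

def rotation(theta):
    """A 2 x 2 rotation matrix (one example of an orthogonal matrix U)."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

def matvec(U, v):
    return (U[0][0] * v[0] + U[0][1] * v[1], U[1][0] * v[0] + U[1][1] * v[1])

def similarity(z, K, U, dz):
    """g(z) = K * U z + dz."""
    w = matvec(U, z)
    return (K * w[0] + dz[0], K * w[1] + dz[1])

def transform_line(tau, zeta0, K, U, dz):
    """Image of the line tau . (zeta - zeta0) = 0 under g, in normal form."""
    return matvec(U, tau), similarity(zeta0, K, U, dz)
```

A fitting equivariant estimator applied to the transformed sample g(z 1 ), . . . , g(z n ) should return the lines obtained by applying transform_line to the lines estimated from the original sample.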

Similarity equivariance of the five estimators
Some troubles, which may arise during estimation, are not addressed yet.
• The estimation may fail with small positive probability. For example, the conic section estimated with the ALS2 estimator is an ellipse with some positive probability, and if it is, then the "ignore-F " estimator fails. (If the estimator is consistent, then the failure probability tends to 0 as n → ∞).
• The estimation may fail, for example, because the estimated line should be parallel to the y-axis, but the estimating procedure does not handle such case.
In order to define the equivariance of an unreliable estimator, we allow that the estimators fail simultaneously in both sides of (36), (37), or (38). Also, we allow that for fixed similarity transformation g(z), equation (36), (37), or (38) does not hold with probability 0.
The equivariance of the ALS2 estimator in the conic section fitting problem is verified in [11,Section 5.5] (see Theorem 30 there for similarity fitting equivariance). That implies the equivariance of the "ignore-F " estimator.
In order to make the estimator with the one-step update before the ignore-F step equivariant, we use the normalization (9) of the ALS2 estimator rather than ‖β̂‖ = 1.
The orthogonal regression estimator and the parametric maximum likelihood estimator are maximum likelihood estimators, but in different models. Thus, they are equivariant.
The criterion function for the RBAN-moment estimator is similarity invariant. This means that the criterion function does not change when the data sample follows a similarity transformation and the parameters follow the inducing transformation. Thus, the RBAN-moment estimator is equivariant.

An example of equivariant but not fitting equivariant estimator
Consider a further restriction of the mixture-of-two-normal-distributions model from Section 2.3. Assume that the covariance matrices Σ 1 and Σ 2 have the same diagonal entries but opposite off-diagonal entries: The statistical structure is invariant under scaling of the y-coordinate, (x new , y new ) = (x old , ry old ), r > 0. This transformation maps the lines y = −kx + h 1 and y = kx + h 2 onto the lines y = −rkx + rh 1 and y = rkx + rh 2 , respectively. The maximum likelihood estimator in this model is equivariant. However, this equivariance is somewhat strange. The transformation of parameters that is induced by the scaling of the y-coordinate of the observed points does not correspond to the same transformation of the true points or to the same mapping of the true lines. The estimated lines follow the transformation of parameters rather than the transformation of observed points. This is illustrated in Fig. 1.
Let k old be the true value of the parameter k before the transformation. Then after the transformation, the value of the parameter is with t = (r 2 k 2 old −1) σ 2 ξ old +(r 2 −1) σ 2 old . If 0 < r 2 ≠ 1, k old ≠ 0, and σ 2 > 0, then k new ≠ rk old . Hence, the maximum likelihood estimator is not fitting equivariant with respect to scaling of the y-coordinate here.

Simulation setup
A sample of the true points (ξ i , η i ), i = 1, . . . , n, is generated from a distribution concentrated on (a subset of) two lines. Three distributions of the true points are used; see Fig. 2. For the same sample of true points {(ξ i , η i ), i = 1, . . . , n}, 100 samples of the measurement errors {(δ i , ε i ), i = 1, . . . , n}, (δ i , ε i ) ⊤ ∼ N (0, σ 2 I), are simulated, and 100 samples of the observed points (x i , y i ) are obtained; see (2) and (3). For each sample of the observed points, the estimates of the parameters of the true lines are evaluated with the following five methods: two ALS2-based estimators (the ignore-F estimator and the estimator with one-step update of the ALS2 estimator before the ignore-F step), the orthogonal regression estimator, the parametric maximum likelihood estimator, and the RBAN moment estimator.
For each estimated couple of lines, the point of their intersection is found. The 100 estimates of intersection points are averaged, and their sample standard deviations are evaluated. For the ALS2-based estimators and the RBAN moment estimator, the standard errors of the estimators are also evaluated.
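The post-processing described above can be sketched as follows, assuming that each estimate is given in the explicit parameterization (k 1 , h 1 , k 2 , h 2 ). The function names are hypothetical, and coordinatewise averages and sample standard deviations are used.

```python
import statistics

def intersection(k1, h1, k2, h2):
    """Intersection point of y = k1*x + h1 and y = k2*x + h2."""
    if k1 == k2:
        return None                      # parallel lines do not intersect
    x = (h2 - h1) / (k1 - k2)
    return x, k1 * x + h1

def summarize(points):
    """Coordinatewise mean and sample standard deviation of the points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mean = (statistics.mean(xs), statistics.mean(ys))
    std = (statistics.stdev(xs), statistics.stdev(ys))
    return mean, std
```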

Notes on computation of particular estimators
For computation of the orthogonal regression estimator, the k-means method is used. Initially, two lines are chosen randomly. Then classification and means steps are alternated. On the classification step, the observed points are split into two clusters based on which line is closer to the point. (The first cluster contains all the observed points that are closer to the first line than to the second line, and the second cluster contains the other observed points.) On the means step, each cluster is fitted with a straight line by the orthogonal regression method (the two lines are updated). The algorithm is completed when the classification step does not change the clusters. The obtained parameters of the two lines deliver a local minimum to the criterion function Q(k 1 , h 1 , k 2 , h 2 ) (24). In an attempt to obtain the global minimum, the algorithm is restarted several times with different initial pairs of lines.
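The alternation of classification and means steps can be sketched as follows. This is an illustrative implementation under assumptions made here, not the code used for the simulations: lines are stored as (a, b, c) with ax + by + c = 0 and a² + b² = 1, a random initial line passes through a randomly chosen observed point in a random direction, a cluster with fewer than two points keeps its previous line, and the number of restarts is arbitrary.

```python
import math
import random

def from_explicit(k, h):
    """Line y = k*x + h as (a, b, c) with a*x + b*y + c = 0, a^2 + b^2 = 1."""
    norm = math.hypot(k, 1.0)
    return (k / norm, -1.0 / norm, h / norm)

def dist2(line, p):
    a, b, c = line
    return (a * p[0] + b * p[1] + c) ** 2

def fit_line(points):
    """Orthogonal regression line through one cluster (principal axis)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)  # direction of the line
    a, b = -math.sin(theta), math.cos(theta)        # unit normal vector
    return (a, b, -(a * mx + b * my))

def criterion(lines, points):
    return sum(min(dist2(lines[0], p), dist2(lines[1], p)) for p in points)

def alternate(points, lines, max_iter=100):
    """Alternate classification and means steps until clusters stabilize."""
    labels = None
    for _ in range(max_iter):
        new_labels = [0 if dist2(lines[0], p) < dist2(lines[1], p) else 1
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        updated = []
        for j in (0, 1):
            cluster = [p for p, lab in zip(points, labels) if lab == j]
            updated.append(fit_line(cluster) if len(cluster) >= 2 else lines[j])
        lines = updated
    return lines, criterion(lines, points)

def random_line(points, rng):
    px, py = rng.choice(points)
    theta = rng.uniform(0.0, math.pi)
    a, b = -math.sin(theta), math.cos(theta)
    return (a, b, -(a * px + b * py))

def fit_two_lines(points, restarts=20, seed=0):
    """Restart from random initial lines and keep the best local minimum."""
    rng = random.Random(seed)
    best = None
    for _ in range(restarts):
        result = alternate(points, [random_line(points, rng),
                                    random_line(points, rng)])
        if best is None or result[1] < best[1]:
            best = result
    return best
```

Each step does not increase the criterion, so the iterations stop at a local minimum; the restarts only make it more likely, not certain, that the global minimum is found.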

RBAN-moment estimator
Since the criterion function may have multiple minima, a consistent estimator (namely, the "ignore-F " estimator) is used as the initial point, and the criterion function Q(θ) is searched for a local minimum nearby. Here θ = (k 1 , h 1 , k 2 , h 2 , σ 2 ) ⊤ is the vector of the parameters of interest.
The knowledge or misspecification of the parameter p does not affect the estimator for the parameters of interest k 1 , . . . , σ 2 . Thus, for estimation of the asymptotic covariance matrix, assume p = 0.5 to be known. The estimator of the asymptotic covariance matrix of (θ, M) is The estimator of the asymptotic covariance matrix of θ is the principal submatrix of Σ θ,M .

Simulation results
Average of estimated centers over 100 simulations, standard deviations over 100 simulations, and medians of estimated standard errors are presented in Tables 1-3.
Using the estimator F̂ (through the one-step update before the ignore-F step), we improve the precision of estimation. The precision of the RBAN-moment estimator is close to the precision of the estimator updated before the ignore-F step. The parametric maximum likelihood estimator is the best when the normality condition, which was assumed during construction of the estimator, is satisfied. Otherwise, it is biased.
The orthogonal regression and the maximum likelihood estimators are good for small error variance (σ 2 = 0.02 2 ). For σ 2 = 0.1 2 , the orthogonal regression estimator breaks down when the distribution of true points is a mixture of two normal distributions and is biased for the two other distributions of true points.
Mean-square deviance of the intersection of the estimated lines from the true intersection point is presented in Table 4.
For small errors, the RBAN-moment estimator is a bit less accurate than the updated before ignore-F step estimator. For σ 2 = 0.1 2 , the difference is negligible.
The parametric maximum likelihood estimator has the smallest deviation from the true value, except for the discrete distribution of true points and σ 2 = 0.01.
For small errors (σ 2 = 0.02 2 ), the orthogonal regression estimator outperforms the consistent estimators and has the deviation approximately as small as the parametric maximum likelihood estimator.
Normalization of the estimator of β affects the ALS2-based estimator of two lines with one-step update before the ignore-F step. With the normalization ‖β‖ = 1, the derived estimator of two lines is not equivariant, whereas with the normalization β ⊤ Ψ ′ n (σ 2 )β = −n, the derived estimator is equivariant. A comparison of the equivariant and nonequivariant versions of the estimator is displayed in Table 5.
The equivariant version of the estimator tends to be more accurate for small samples than the nonequivariant version. The two versions of the estimator are consistent and asymptotically equivalent. When the estimation is precise, the difference between the versions is negligible. When the estimation is imprecise, it is impossible to infer which version is more accurate.

Comparison of two estimators for asymptotic covariance matrix in the conic section fitting model
In [13] a conic section fitting model is considered, and two estimators ( Σ true and Σ sample ) for the asymptotic covariance matrix of the ALS2 estimator are constructed. The software developed here can be used to compare the estimates of the asymptotic covariance matrices numerically. The data are generated as described in Section 4.1 with 1000 simulations for each set of true points. Thus, the true conic was chosen degenerate, although this is not necessary. For each simulation, the parameters of the conic section are estimated, its center is found, and two confidence ellipsoids for the center are constructed using the two different estimators of the asymptotic covariance matrix. The sample coverage probability and the median (over 1000 ellipsoids) area of the confidence ellipsoids are presented in Table 6. The ellipsoids were constructed for confidence levels 0.8 and 0.95. The area of the 95% confidence ellipsoids is displayed in Table 6, and the area of an 80% confidence ellipsoid is log_20 5 ≈ 0.5372 of the area of the corresponding 95% confidence ellipsoid. Note that the standard errors for the coverage probability are 1.3% for 80% confidence ellipsoids and 0.7% for 95% confidence ellipsoids. The simulations do not allow us to infer which estimator is better. Thus, the Σ sample -based estimator updated before the ignore-F step is compared with the other estimators in the simulations in Section 4.4 because of the simpler explicit expression for Σ sample .
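The two constants quoted above can be reproduced with a short computation. The sketch assumes that the standard error of a sample coverage probability is the binomial standard error √(p(1 − p)/N) with N = 1000 simulations, and that the area of a confidence ellipse at level p for a bivariate normal distribution is proportional to −2 ln(1 − p), the corresponding quantile of the chi-square distribution with 2 degrees of freedom. The function names are chosen here for illustration.

```python
import math

def coverage_standard_error(p, n_sim):
    """Binomial standard error of a sample coverage probability."""
    return math.sqrt(p * (1.0 - p) / n_sim)

def area_ratio(level_small, level_large):
    """Ratio of the areas of two confidence ellipses of the same shape."""
    return math.log(1.0 - level_small) / math.log(1.0 - level_large)

se_80 = coverage_standard_error(0.80, 1000)   # about 0.0126
se_95 = coverage_standard_error(0.95, 1000)   # about 0.0069
ratio = area_ratio(0.80, 0.95)                # ln 5 / ln 20, about 0.5372
```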

A Proofs
Proof of Proposition 1. The strong consistency of the estimator follows from [11,Theorem 17]. Under the conditions of Proposition 1, 1 n Ψ n → Ψ ∞ (a.s. in the structural model).
Proof of Proposition 2. The strong consistency of β follows from (7) and (41). The proof of asymptotic normality and consistency of the estimator of the asymptotic covariance matrix can be obtained by modification of the proofs of Theorem 2 in [12] and Theorem 3 in [13].