Extended Characteristic Polynomial

Please note this is a comparison between Version 2 by Lorentz Jäntschi and Version 1 by Lorentz Jäntschi.

In molecular topology, the Characteristic Polynomial (ChP) is well known for its use in relating structure to the molecular properties of hydrocarbons, as well as for being a high-resolution discriminant of chemical structures. It is equally well known that, like any other topology-based descriptor or function, the ChP is blind to the nature of the chemical element and of the chemical bond, both being treated as indistinguishable in chemical graph theory. An extension of the Characteristic Polynomial (EChP) is proposed by relaxing the identity and adjacency matrices to contain non-binary values while keeping the meaning of those matrices (identity as collecting information about the identity of the atoms in the molecule, adjacency as collecting information about the connectivity, or bonds, in the molecule) and their symmetry.

characteristic polynomial
molecular descriptors
structure-property relationships

Extending the Characteristic Polynomial

The characteristic polynomial was first encountered (at that time as the so-called 'secular equation') in connection with the motion of the planets by famous mathematicians: Euler (1743)^[1], D’Alembert (1750)^[2], Lagrange (1773)^[3], and Laplace (1775)^[4]. Its theory and use were gradually refined by others: Fourier (1822)^[5], Cauchy (1829)^[6], Sylvester (1852)^[7], Hermite (1857)^[8], Weierstrass (1867)^[9], Jordan (1872)^[10], and Kronecker (1890)^[11].

Once the more general problem was identified by Sylvester (1880)^[12] as holding for any Hessian matrix, the characteristic polynomial found use in the approximate treatment of Schrödinger's equation (1926)^[13] for the wavefunction, with Hartree (1928)^[14]^,^[15] and Fock (1930)^[16]^,^[17] arriving at the same eigenvector-eigenvalue problem as in Slater's treatment (1929)^[18], later revised by Hartree & Hartree (1935)^[19].

While the first reports on the use of the characteristic polynomial in relation to chemical structure appeared shortly after the discovery of the wave-based treatment of the microscopic level, with Hückel (1931)^[20], the roots of this extended version of the characteristic polynomial can be traced to the assignment of individual electronic energies (ε_i), first accounted for by Coulson in 1937^[21], 1940^[22], and 1950^[23].

The characteristic polynomial (ChP) is the natural construction of a polynomial whose roots are the eigenvalues of the adjacency matrix [Ad], as follows:

λ is an eigenvalue of [Ad] ↔ there exists an eigenvector [v] ≠ 0 such that λ·[v] = [Ad]·[v] →

(λ·[Id] - [Ad])·[v] = 0; since [v] ≠ 0 → [λ·Id - Ad] is singular → det([λ·Id - Ad]) = 0

The characteristic polynomial ChP is a polynomial in λ whose degree equals the number of atoms, ChP(λ) = det([λ·Id - Ad]).
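
As a minimal sketch of this definition, the snippet below computes the ChP coefficients of a three-atom chain, which is the heavy-atom topology of a molecule such as ethanimine (N=C-C). The function name `chp_coefficients` and the atom ordering are assumptions made for illustration; `numpy.poly` returns the coefficients of det(λ·Id - A) for a square matrix A.

```python
import numpy as np

def chp_coefficients(ad):
    """Coefficients of ChP(λ) = det(λ·Id - Ad), highest power of λ first."""
    return np.poly(np.asarray(ad, dtype=float))

# Adjacency matrix of a three-atom chain 1-2-3 (assumed atom ordering).
ad = [[0, 1, 0],
      [1, 0, 1],
      [0, 1, 0]]

coeffs = chp_coefficients(ad)          # close to [1, 0, -2, 0], i.e. λ³ - 2λ
eigenvalues = np.linalg.eigvalsh(ad)   # eigenvalues of the symmetric matrix Ad

# Each eigenvalue of Ad is a root of the ChP (up to rounding error).
residuals = np.polyval(coeffs, eigenvalues)
```

The degree of the polynomial (3) equals the number of atoms in the graph, as stated above.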

As introduced in (Joița & Jäntschi, 2017)^[24], the Extended Characteristic Polynomial (EChP) of a chemical structure is defined similarly to the Characteristic Polynomial (ChP) of its associated topology (T) (see Table 1).

Table 1. Topology, Chemistry and Geometry for Ethanimine (HN=CH-CH₃)

Legend for replacement values:

a_C = 12/294; a_N = 14/294; b_C = 3915/3915; b_N = 77.355/3915; d_C = 2267/30000; d_N = 1495/30000; e_C = 2.55/4.00; e_N = 3.04/4.00; f_C = 1086.2/1312.0; f_N = 1402/1312; g_C = 3915/3915; g_N = 63.15/3915; i₁ = -0.738; i₂ = 0.497; i₃ = -0.756; j₁ = -0.593; j₂ = 0.090; j₃ = -0.653; k₁ = -0.628; k₂ = 0.097; k₃ = -0.734; l₁ = 0.0; l₂ = 0.0; l₃ = 0.0; g₁₂ = 1/1.290; g₁₃ = 1/2.451; g₂₃ = 1/1.496; b₁₂ = 1/1.869; b₂₃ = 1/0.905; B₁₃ = 1/(1/1.869+1/0.905).

Extended Characteristic Polynomial

A natural extension of the ChP is to store in the identity matrix non-unity values that distinguish different atoms, so that [Id] is replaced by a generic matrix [I_p] that can be filled in more than one way, and to store in the adjacency matrix non-unity values that distinguish different bonds, so that [Ad] is replaced by a generic matrix [C_q] that can likewise be filled in more than one way. The extended characteristic polynomial is then defined by EChP(λ; p, q) = det([λ·I_p - C_q]).
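
The definition can be sketched in a few lines, assuming that [I_p] is a diagonal matrix of per-atom values and [C_q] a symmetric matrix of per-bond values. The function name `echp`, the atom order (1 = N, 2 = C, 3 = C) and the choice λ = 0.5 are assumptions for illustration; the numeric values are the a_N, a_C, g₁₂ and g₂₃ entries of the Table 1 legend.

```python
import numpy as np

def echp(lam, i_p, c_q):
    """Evaluate EChP(λ; p, q) = det(λ·I_p - C_q) at a single value of λ."""
    i_p = np.asarray(i_p, dtype=float)
    c_q = np.asarray(c_q, dtype=float)
    return float(np.linalg.det(lam * i_p - c_q))

# Assumed atom order for the heavy atoms of ethanimine: 1 = N, 2 = C, 3 = C.
a_N, a_C = 14 / 294, 12 / 294        # "A" (atomic mass) values from the legend
g12, g23 = 1 / 1.290, 1 / 1.496      # "g" (inverse bonded distance) values

i_A = np.diag([a_N, a_C, a_C])       # replacement for [Id]
c_g = np.array([[0.0, g12, 0.0],     # replacement for [Ad]
                [g12, 0.0, g23],
                [0.0, g23, 0.0]])

value = echp(0.5, i_A, c_g)

# With I_p = Id and C_q = Ad the function reduces to the classical ChP.
classical = echp(1.0, np.eye(3), [[0, 1, 0], [1, 0, 1], [0, 1, 0]])
```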

The proposed alternatives for identifying the atoms are p ∈ {"A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L"}, exemplified in Table 1 and having the following meanings: Atomic mass ("A"), Boiling point ("B"), Count ("C"), Density ("D"), Electronegativity ("E"), First ionization potential ("F"), Melting point ("G"), Hydrogen connections ("H"), Electrostatic charge ("I"), Mulliken charge ("J"), Natural charge ("K"), and Spin ("L").

The proposed alternatives for identifying the bonds are q ∈ {"t", "g", "c", "b", "T", "G", "C", "B"}, exemplified in Table 1. They are derived from adjacencies ({"t", "g", "c", "b"}) or from distances ({"T", "G", "C", "B"}) and use classical topological measures ("t" and "T"), inverses of geometrical distances ("g" and "G"), inverses of conventional bond orders ("c" and "C"), and inverses of Mulliken bond orders ("b" and "B").

It should be noted that all proposed alternatives preserve symmetry in both the identity (I_p) and connectivity (C_q) replacements of the identity matrix ([Id]) and adjacency matrix ([Ad]), thus keeping the Hessian property of their linear combination.

It should also be noted that any entry greater than 1 in the identity (I_p) or connectivity (C_q) matrices is hazardous for computing the associated characteristic polynomial (a number greater than 1 quickly grows to huge values when raised to a natural power). Proportion scales (from 0 to 1, or from -1 to 1) should therefore be used whenever possible to keep evaluations of the extended characteristic polynomial at a reasonably low magnitude.

A series of constant values has been used for a group of atomic properties ({"A", "B", "D", "E", "F", "G", "H"}), with the chemical elements tracked through their atomic number (Z) as listed in Table 2.

Table 2. List of the atomic numbers associating atomic properties with chemical elements

The first element of the list (line 0, column 0 in Table 2) holds the scale of the atomic property, that is, the value by which all other values are divided when they are used as atomic property descriptors.
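
A minimal sketch of this scaling convention follows. The list layout (position 0 for the scale, later positions aligned with a list of atomic numbers) and the names `ATOMIC_NUMBERS`, `ATOMIC_MASS`, `ELECTRONEGATIVITY` and `scaled_property` are assumptions for illustration; only the carbon and nitrogen values that appear in the Table 1 legend are used.

```python
# Hypothetical layout: position 0 is reserved for the scale entry,
# the remaining positions are aligned with the atomic numbers.
ATOMIC_NUMBERS = [0, 6, 7]                 # placeholder, C, N
ATOMIC_MASS = [294.0, 12.0, 14.0]          # scale, C, N (from the legend)
ELECTRONEGATIVITY = [4.00, 2.55, 3.04]     # scale, C, N (from the legend)

def scaled_property(table, z):
    """Return the property of element z divided by the scale stored at index 0."""
    position = ATOMIC_NUMBERS.index(z, 1)  # search from 1 to skip the scale slot
    return table[position] / table[0]

a_C = scaled_property(ATOMIC_MASS, 6)          # 12/294
e_N = scaled_property(ELECTRONEGATIVITY, 7)    # 3.04/4.00
```

Dividing by the scale keeps these entries between 0 and 1, which is the kind of proportion scale recommended above.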

The following tables (Table 3 to Table 9) list the constants used for each chemical element.

Table 3. Atomic masses

Table 4. Boiling points

Table 5. Densities

Table 6. Electronegativities

Table 7. First ionization potentials

Table 8. Melting points

Table 9. Valences (to be used for Hydrogen connections)

The valences (Table 9) are used to calculate the number of attached hydrogen atoms from the bonds and their orders (by subtracting the sum of the bond orders from the valence), while the count ("C" property) is trivial: it is always 1. The remaining atomic properties ({"I", "J", "K", "L"}) are molecule-dependent and are expected to be provided by energy calculations on the molecule. As can be observed, apart from hydrogen connections ("H", valences listed in Table 9), the rest of the tabulated atomic properties ({"A", "B", "C", "D", "E", "F", "G"}) are molecule-independent.
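
The hydrogen-count rule can be sketched as below for the heavy atoms of ethanimine (HN=CH-CH₃). The function name `hydrogen_counts`, the atom order (N, C, C) and the valences 3 for nitrogen and 4 for carbon are assumptions for illustration, since the content of Table 9 is not reproduced here.

```python
def hydrogen_counts(valences, bond_orders):
    """Attached hydrogens per atom: valence minus the sum of its bond orders."""
    return [v - sum(row) for v, row in zip(valences, bond_orders)]

valences = [3, 4, 4]            # assumed valences for N, C, C
bond_orders = [[0, 2, 0],       # N=C double bond
               [2, 0, 1],       # C-C single bond
               [0, 1, 0]]

h = hydrogen_counts(valences, bond_orders)   # [1, 1, 3], matching HN=CH-CH₃
```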

As also exemplified in Table 1, the connectivities are calculated as follows:

"t" is the adjacency matrix ([Ad]) itself;
"T" is the topological distance matrix ([Di]), inverted ([C_T]_i,j = 1/[Di]_i,j for i ≠ j, 0 otherwise);
"G" is the geometrical distance matrix ([Dxyz]), inverted ([C_G]_i,j = 1/[Dxyz]_i,j for i ≠ j, 0 otherwise);
"g" is the element-wise product of the inverted geometrical distances ([C_G]) and the adjacency matrix ([Ad]) ([C_g]_i,j = [C_G]_i,j·[Ad]_i,j);
"C" and "B" first construct a replacement for the adjacency matrix by replacing the values of 1 with the inverse of the bond order ("C" with classical bond orders and "B" with Mulliken bond orders), then calculate a distance matrix ([DC], [DB]) from this new adjacency matrix, and finally invert those values ([C_C]_i,j = 1/[DC]_i,j for i ≠ j, 0 otherwise; [C_B]_i,j = 1/[DB]_i,j for i ≠ j, 0 otherwise);
"c" and "b" are the element-wise products of the previously calculated [C_C] and [C_B] matrices with the adjacency matrix ([C_c]_i,j = [C_C]_i,j·[Ad]_i,j; [C_b]_i,j = [C_B]_i,j·[Ad]_i,j).

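The construction rules in the list above can be sketched as follows. This is an illustrative reading of the textual description, not the authors' implementation: the helper names, the atom order (1 = N, 2 = C, 3 = C), the conventional bond orders (2 and 1) and the use of the Floyd-Warshall algorithm for shortest paths are assumptions. The geometrical distances are the ones appearing in the Table 1 legend, and "g" is taken as the inverse distance restricted to bonded pairs.

```python
import numpy as np

def shortest_path_matrix(weights):
    """Floyd-Warshall distances; weights[i, j] > 0 is a bond length, 0 is no bond."""
    w = np.asarray(weights, dtype=float)
    dist = np.where(w > 0, w, np.inf)
    np.fill_diagonal(dist, 0.0)
    for k in range(w.shape[0]):
        dist = np.minimum(dist, dist[:, [k]] + dist[[k], :])
    return dist

def inverse_off_diagonal(dist):
    """1/d for i != j, 0 on the diagonal."""
    d = np.asarray(dist, dtype=float)
    out = np.zeros_like(d)
    mask = ~np.eye(d.shape[0], dtype=bool)
    out[mask] = 1.0 / d[mask]
    return out

ad = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
dxyz = np.array([[0.0, 1.290, 2.451],
                 [1.290, 0.0, 1.496],
                 [2.451, 1.496, 0.0]])
orders = np.array([[0, 2, 0], [2, 0, 1], [0, 1, 0]], dtype=float)  # assumed

c_t = ad                                               # "t"
c_T = inverse_off_diagonal(shortest_path_matrix(ad))   # "T"
c_G = inverse_off_diagonal(dxyz)                       # "G"
c_g = c_G * ad                                         # "g": bonded pairs only

# "C": bonds weighted by inverse bond order, then distances, then inversion.
inv_orders = np.divide(1.0, orders, out=np.zeros_like(orders), where=orders > 0)
c_C = inverse_off_diagonal(shortest_path_matrix(inv_orders))
c_c = c_C * ad                                         # "c"
```

The "B" and "b" matrices follow the same steps as "C" and "c", with Mulliken bond orders in place of the conventional ones.
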
Parameterized by the atomic property (p ∈ {"A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L"}) and by the connectivity (q ∈ {"t", "g", "c", "b", "T", "G", "C", "B"}), the extended characteristic polynomial becomes a family of polynomial functions (EChP(λ; p, q)).

Practical use of the Characteristic Polynomial extension

It was previously shown in (Bolboacă & Jäntschi, 2007)^[25] how the classical characteristic polynomial can be used to link chemical structure with measured properties and activities of molecules built from hydrocarbons.

In the same manner, the EChP(λ; p, q) may serve to link chemical structure with measured properties and activities of molecules built from any chemical elements.

One strategy is to evaluate the EChP(λ; p, q) polynomial at a series of evenly spaced points from -1 to 1. With three-decimal arguments (λ = -1.000, -0.999, …, -0.001, 0.000, 0.001, …, 0.999, 1.000) there are 2001 evaluation points.
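
This evaluation strategy can be sketched as below, assuming NumPy and a hypothetical helper `echp_on_grid`. The determinant is taken over a stack of matrices λ·I_p - C_q, one per grid point; the classical case (I_p = Id, C_q = Ad for a three-atom chain) is used only as a simple check.

```python
import numpy as np

def echp_on_grid(i_p, c_q, steps=1000):
    """Evaluate det(λ·I_p - C_q) for λ = -1, ..., 1 in increments of 1/steps."""
    i_p = np.asarray(i_p, dtype=float)
    c_q = np.asarray(c_q, dtype=float)
    lambdas = np.arange(-steps, steps + 1) / steps       # 2*steps + 1 points
    stack = lambdas[:, None, None] * i_p - c_q           # shape (points, n, n)
    return lambdas, np.linalg.det(stack)

ad = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
lambdas, values = echp_on_grid(np.eye(3), ad)
n_points = len(lambdas)    # 2001
```

Building the grid from integers and dividing once avoids the accumulated rounding error of repeatedly adding 0.001.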

Multiplying by the number of alternatives for the identity (I_p) and connectivity (C_q) choices (p and q), the number of individuals in the EChP family is 2001·12·8 = 192096. Three linearization operations can be applied ("I", "R", "L") through the functions f_I(x) = x, f_R(x) = 1/x and f_L(x) = ln(x), increasing the number of individuals in the EChP family to 576288. To keep track of each individual, the naming convention is L₁L₂L₃C₀D₁D₂D₃, where L₁ ∈ {"I", "R", "L"} stands for the linearization, L₂ ∈ {"A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L"} for the identity, and L₃ ∈ {"t", "g", "c", "b", "T", "G", "C", "B"} for the connectivity, with the correspondences given above, while C₀ is "N" for λ < 0, "0" for λ = 0, and "P" for λ > 0, and D₁D₂D₃ is the group of three digits from the evaluating value of λ (varying from "000" to "999").
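
The counting and the naming convention can be sketched as follows. The function `descriptor_name` and its integer argument `k` (with λ = k/1000) are hypothetical, and encoding λ = ±1.000 with the digits "000" is an assumption that follows from restricting the digit group to "000" through "999".

```python
LINEARIZATIONS = "IRL"
ATOM_PROPERTIES = "ABCDEFGHIJKL"
CONNECTIVITIES = "tgcbTGCB"
GRID_POINTS = 2001

def descriptor_name(lin, p, q, k):
    """Name of one family member; k is the grid index, λ = k / 1000."""
    if lin not in LINEARIZATIONS or p not in ATOM_PROPERTIES or q not in CONNECTIVITIES:
        raise ValueError("unknown linearization, atomic property or connectivity")
    if not -1000 <= k <= 1000:
        raise ValueError("k must lie in [-1000, 1000]")
    sign = "N" if k < 0 else ("0" if k == 0 else "P")
    # Assumption: λ = ±1.000 is written with the digits "000".
    return f"{lin}{p}{q}{sign}{abs(k) % 1000:03d}"

family_size = (len(LINEARIZATIONS) * len(ATOM_PROPERTIES)
               * len(CONNECTIVITIES) * GRID_POINTS)       # 576288

example = descriptor_name("I", "A", "t", 250)             # "IAtP250"
names = {descriptor_name("I", "A", "t", k) for k in range(-1000, 1001)}
```

Under this reading the 2001 grid points of one (linearization, identity, connectivity) combination receive 2001 distinct names.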

The paper (Joița & Jäntschi, 2017)^[24] gives an application of the EChP to a series of C₂₀ fullerene congeners obtained by replacing carbon atoms with nitrogen and boron in a patterned manner. The results showed good explanatory capability of the EChP for the structure-property relationship for the area (r²_adj = 0.994) and for the volume (r²_adj = 0.946) of the series of 46 C₂₀ fullerene congeners.

It should be noted that the constants given in Tables 3 to 9 for use as scaling factors for the atomic properties (as opposed to the atomic properties themselves, which were taken from measured or published data) are more or less arbitrary and open to debate, but the principle of constructing the EChP is sound. Further studies are needed to find a best fit for the scaling factors (possibly by parameterizing them).

References

Euler, L. De integratione aequationum differentialium altiorum graduum; Miscellanea Berolinensia 7 = Opera omnia (1) 22 (1936): Basel, 1743; pp. 108-213.
D’Alembert, J. Suite des recherches sur le calcul intégral; Histoire de l'académie des sciences (1750): Berlin, 1748; pp. 249-291.
Lagrange, J.-L. Sur L’équation Séculaire de la Lune; Mémoires de l’Académie Royale des Sciences (1774): Paris, 1773; pp. 335-399.
Laplace, P. S. Mémoire sur les solutions particulières des équations différentielles et sur les inégalités séculaires des planètes; Mémoires de l’Académie Royale des Sciences (1775): Paris, 1775; pp. 325-366.
Fourier, J. B. J. Théorie analytique de la chaleur; Chez Firmin Didot & Père et fils: Paris, 1822; pp. 684.
Cauchy, A.-L. Sur l’équation à l’aide de laquelle on détermine les inégalités séculaires des mouvements des planètes. Exercices de mathématiques; Rpt. in Oeuvres complètes, série 2, tome 9, 4: Paris, 1829; pp. 174-195.
Sylvester, J.J.; Sur une propriété nouvelle de l’équation qui sert à déterminer les inégalités séculaires des planètes. Nouvelles annales de mathématiques, journal des candidats aux écoles polytechnique et normale 1852, 1(11), 434-440.
Hermite, C.; Extrait d'une lettre à M. Borchardt sur l'invariabilité du nombre des carrés positifs et des carrés négatifs dans la transformation des polynômes homogènes du second degré. Journal für die reine und angewandte Mathematik 1857, 53, 271-274.
Weierstrass, K. Zur theorie der bilinearen und quadratischen formen; Monatsh. Akad. Wiss.: Berlin, 1867; pp. 310-338.
Jordan, C.; Sur les oscillations infiniment petites des systèmes matériels. Comptes rendus de l’Académie des sciences de Paris 1872, 74, 1395-1399.
Kronecker, L. Algebraische reduction der schaaren bilinearer formen; S.-B. Akad.: Berlin, 1890; pp. 763-776.
Sylvester, J.J.; On the theorem connected with Newton's rule for the discovery of imaginary roots of equations. Messenger of Mathematics 1880, 9, 71-84.
Schrödinger, E.; An undulatory theory of the mechanics of atoms and molecules. Physical Review 1926, 28(6), 1049-1070, 10.1103/PhysRev.28.1049.
D. R. Hartree; The Wave Mechanics of an Atom with a Non-Coulomb Central Field. Part I. Theory and Methods. Mathematical Proceedings of the Cambridge Philosophical Society 1928, 24(1), 89-110, 10.1017/s0305004100011919.
D. R. Hartree; The Wave Mechanics of an Atom with a Non-Coulomb Central Field. Part II. Some Results and Discussion. Mathematical Proceedings of the Cambridge Philosophical Society 1928, 24(1), 111-132, 10.1017/s0305004100011920.
V. Fock; Näherungsmethode zur Lösung des quantenmechanischen Mehrkörperproblems. The European Physical Journal A 1930, 61(1-2), 126-148, 10.1007/bf01340294.
V. Fock; "Selfconsistent field" mit Austausch für Natrium. The European Physical Journal A 1930, 62(11-12), 795-805, 10.1007/bf01330439.
J. C. Slater; The Theory of Complex Spectra. Physical Review 1929, 34(10), 1293-1295, 10.1103/PhysRev.34.1293.
D. R. Hartree; W. Hartree; Self-consistent field, with exchange, for beryllium. Proceedings of the Royal Society of London. Series A. Mathematical and Physical Sciences 1935, 150(869), 9-33, 10.1098/rspa.1935.0085.
Erich Hückel; Quantentheoretische Beiträge zum Benzolproblem. The European Physical Journal A 1931, 70, 204-286, 10.1007/bf01339530.
C. A. Coulson; The evaluation of certain integrals occurring in studies of molecular structure. Mathematical Proceedings of the Cambridge Philosophical Society 1937, 33(1), 104-110, 10.1017/s0305004100016820.
C. A. Coulson; On the calculation of the energy in unsaturated hydrocarbon molecules. Mathematical Proceedings of the Cambridge Philosophical Society 1940, 36(2), 201-203, 10.1017/s0305004100017175.
C. A. Coulson; Notes on the secular determinant in molecular orbital theory. Mathematical Proceedings of the Cambridge Philosophical Society 1950, 46(1), 202-205, 10.1017/s0305004100025639.
Dan-Marian Joiţa; Lorentz Jäntschi; Extending the Characteristic Polynomial for Characterization of C20 Fullerene Congeners. Mathematics 2017, 5(4), 84(12), 10.3390/math5040084.
Sorana Daniela Bolboaca; Lorentz Jantschi; How Good Can the Characteristic Polynomial Be for Correlations?. International Journal of Molecular Sciences 2007, 8(4), 335-345, 10.3390/i8040335.