Disordered Systems and Neural Networks
- [1] arXiv:2406.05335
Title: Critical Phase Transition in a Large Language Model
Comments: 9 pages, 6 figures
Subjects: Disordered Systems and Neural Networks (cond-mat.dis-nn); Machine Learning (cs.LG)
The performance of large language models (LLMs) strongly depends on the \textit{temperature} parameter. Empirically, at very low temperatures, LLMs generate sentences with clear repetitive structures, while at very high temperatures, generated sentences are often incomprehensible. In this study, using GPT-2, we numerically demonstrate that the difference between the two regimes is not just a smooth change but a phase transition with singular, divergent statistical quantities. Our extensive analysis shows that critical behaviors, such as a power-law decay of correlation in a text, emerge in the LLM at the transition temperature as well as in a natural language dataset. We also discuss that several statistical quantities characterizing the criticality should be useful to evaluate the performance of LLMs.
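For readers unfamiliar with the temperature parameter, the following is a minimal sketch of temperature-scaled softmax sampling weights, assuming the standard convention in which next-token probabilities are proportional to exp(logit / T). The function name and the example logits are hypothetical and are not taken from the paper or from GPT-2.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Return probabilities proportional to exp(logit / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical next-token scores, for illustration only.
logits = [2.0, 1.0, 0.0]
low_t = softmax_with_temperature(logits, 0.1)    # nearly deterministic
high_t = softmax_with_temperature(logits, 10.0)  # nearly uniform
```

At low temperature almost all probability mass sits on the top-scoring token, which is consistent with the repetitive regime the abstract describes. At high temperature the distribution flattens toward uniform.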
- [2] arXiv:2406.05448
Title: Reconsideration of optimization for reduction of traffic congestion
Comments: 2 pages
Subjects: Disordered Systems and Neural Networks (cond-mat.dis-nn); Quantum Physics (quant-ph)
One of the most impressive applications of a quantum annealer was optimizing a group of Volkswagen to reduce traffic congestion using a D-Wave system. A simple formulation of a quadratic term was proposed to reduce traffic congestion. This quadratic term was useful for determining the shortest routes among several candidates. The original formulation produced decreases in the total lengths of car tours and traffic congestion. In this study, we reformulated the cost function with the sole focus on reducing traffic congestion. We then found a unique cost function for expressing the quadratic function with a dead zone and an inequality constraint.
- [3] arXiv:2406.05842
Title: Replica symmetry breaking in spin glasses in the replica-free Keldysh formalism
Comments: 17 pages, 4 figures
Subjects: Disordered Systems and Neural Networks (cond-mat.dis-nn); Statistical Mechanics (cond-mat.stat-mech); Strongly Correlated Electrons (cond-mat.str-el); Mathematical Physics (math-ph); Quantum Physics (quant-ph)
At asymptotically late times ultrametricity can emerge from the persistent slow aging dynamics of the glass phase. We show that this suffices to recover the breaking of replica symmetry in mean-field spin glasses from the late time limit of the time evolution using the Keldysh path integral. This provides an alternative approach to replica symmetry breaking by connecting it rigorously to the dynamic formulation. Stationary spin glasses are thereby understood to spontaneously break thermal symmetry, or the Kubo-Martin-Schwinger relation of a state in global thermal equilibrium. We demonstrate our general statements for the spherical quantum $p$-spin model and the quantum Sherrington-Kirkpatrick model in the presence of transverse and longitudinal fields. In doing so, we also derive their dynamical Ginzburg-Landau effective Keldysh actions starting from microscopic quantum models.
- [4] arXiv:2406.06346
Title: Dynamical Mean-Field Theory of Complex Systems on Sparse Directed Networks
Comments: 19 pages, 5 figures
Subjects: Disordered Systems and Neural Networks (cond-mat.dis-nn); Statistical Mechanics (cond-mat.stat-mech); Physics and Society (physics.soc-ph)
Although real-world complex systems typically interact through sparse and heterogeneous networks, analytic solutions of their dynamics are limited to models with all-to-all interactions. Here, we solve the dynamics of a broad range of nonlinear models of complex systems on sparse directed networks with a random structure. By generalizing dynamical mean-field theory to sparse systems, we derive an exact equation for the path-probability describing the effective dynamics of a single degree of freedom. Our general solution applies to key models in the study of neural networks, ecosystems, epidemic spreading, and synchronization. Using the population dynamics algorithm, we solve the path-probability equation to determine the phase diagram of a seminal neural network model in the sparse regime, showing that this model undergoes a transition from a fixed-point phase to chaos as a function of the network topology.
New submissions for Tuesday, 11 June 2024 (showing 4 of 4 entries)
- [5] arXiv:2406.05155 (cross-list from cond-mat.quant-gas)
Title: Multifractality and hyperuniformity in quasicrystalline Bose-Hubbard models with and without disorder
Comments: 12 pages, 17 figures
Subjects: Quantum Gases (cond-mat.quant-gas); Disordered Systems and Neural Networks (cond-mat.dis-nn); Statistical Mechanics (cond-mat.stat-mech); Strongly Correlated Electrons (cond-mat.str-el)
Clarifying similarities and differences in physical properties between crystalline and quasicrystalline systems is one of the central issues in studying quasicrystals. To contribute to this, we apply multifractal and hyperuniform analyses to nonuniform spatial patterns in the Bose-Hubbard model on the Penrose and Ammann-Beenker tilings. Based on the mean-field approximation, we obtain real-space distributions of boson density and bosonic condensate. In both Mott insulating and superfluid phases, the distributions are hyperuniform. Analyzing the order metric that quantifies the complexity of nonuniform spatial patterns, we find that both quasicrystals show a divergence of the order metric at a phase boundary between the Mott insulating and superfluid phases, in stark contrast to the case of a periodic square lattice. Our results suggest that hyperuniformity is a useful concept to differentiate between crystalline and quasicrystalline bosonic systems. Moreover, we introduce on-site random potentials into these quasicrystalline Bose-Hubbard models, leading to a Bose glass phase. Contrary to the Mott insulating and superfluid phases, we find that the Bose glass phase is multifractal. The same multifractality appears in a Bose glass phase on the periodic square lattice. Therefore, multifractality is common in a Bose glass phase irrespective of the periodicity of systems.
- [6] arXiv:2406.05377 (cross-list from cs.AR)
Title: Highly Versatile FPGA-Implemented Cyber Coherent Ising Machine
Authors: Toru Aonishi, Tatsuya Nagasawa, Toshiyuki Koizumi, Mastiyage Don Sudeera Hasaranga Gunathilaka, Kazushi Mimura, Masato Okada, Satoshi Kako, Yoshihisa Yamamoto
Comments: 20 pages, 9 figures
Subjects: Hardware Architecture (cs.AR); Disordered Systems and Neural Networks (cond-mat.dis-nn); Distributed, Parallel, and Cluster Computing (cs.DC); Emerging Technologies (cs.ET); Quantum Physics (quant-ph)
In recent years, quantum Ising machines have drawn a lot of attention, but due to physical implementation constraints, it has been difficult to achieve dense coupling, such as full coupling with sufficient spins to handle practical large-scale applications. Consequently, classically computable equations have been derived from quantum master equations for these quantum Ising machines. Parallel implementations of these algorithms using FPGAs have been used to rapidly find solutions to these problems on a scale that is difficult to achieve in physical systems. We have developed an FPGA-implemented cyber coherent Ising machine (cyber CIM) that is much more versatile than previous implementations using FPGAs. Our architecture is versatile since it can be applied to the open-loop CIM, which was proposed when CIM research began, to the closed-loop CIM, which has been used recently, as well as to the Jacobi successive over-relaxation method. By modifying the sequence control code for the calculation control module, other algorithms such as Simulated Bifurcation (SB) can also be implemented. Earlier research on large-scale FPGA implementations of SB and CIM used binary or ternary discrete values for connections, whereas the cyber CIM uses FP32 values. The cyber CIM also uses Zeeman terms represented as FP32, which were not present in other large-scale FPGA systems. Our implementation with continuous interaction realizes N=4096 on a single FPGA, comparable to the single-FPGA implementation of SB with binary interactions, with N=4096. The cyber CIM enables applications such as CDMA multi-user detector and L0 compressed sensing which were not possible with earlier FPGA systems, while enabling superior calculation speeds, more than ten times faster than a GPU implementation. The calculation speed can be further improved by increasing parallelism, such as through clustering.
- [7] arXiv:2406.05461 (cross-list from cond-mat.soft)
Title: Pyroresistive response of percolating conductive polymer composites
Comments: 9 pages with 7 figures; supplemental material: 7 pages with 6 figures
Journal-ref: Phys. Rev. Mater. 8, 045602 (2024)
Subjects: Soft Condensed Matter (cond-mat.soft); Disordered Systems and Neural Networks (cond-mat.dis-nn); Materials Science (cond-mat.mtrl-sci)
The pyroresistive response of conductive polymer composites (CPCs) has attracted much interest because of its potential applications in many electronic devices requiring a significant responsiveness to changes in external physical parameters such as temperature or electric fields. Although extensive research has been conducted to study how the properties of the polymeric matrix and conductive fillers affect the positive temperature coefficient pyroresistive effect, the understanding of the microscopic mechanism governing such a phenomenon is still incomplete. In particular, to date, there is little theoretical research devoted to investigating the effect of the polymer thermal expansion on the electrical connectivity of the conductive phase. Here, we present the results of simulations of model CPCs in which rigid conductive fillers are dispersed in an insulating amorphous matrix. By employing a meshless algorithm to analyze the thermoelastic response of the system, we couple the computed strain field to the electrical connectedness of the percolating conductive particles. We show that the electrical conductivity responds to the local strains that are generated by the mismatch between the thermal expansion of the polymeric and conductive phases and that the conductor-insulator transition is caused by a sudden and global disconnection of the electrical contacts forming the percolating network.
- [8] arXiv:2406.05500 (cross-list from cond-mat.stat-mech)
Title: Survival probability, particle imbalance, and their relationship in quadratic models
Subjects: Statistical Mechanics (cond-mat.stat-mech); Disordered Systems and Neural Networks (cond-mat.dis-nn); Quantum Gases (cond-mat.quant-gas); Quantum Physics (quant-ph)
We argue that the dynamics of particle imbalance in quadratic fermionic models is, for the majority of initial many-body product states in the site occupation basis, virtually indistinguishable from the dynamics of survival probabilities of single-particle states. We then generalize our statement to a similar relationship between the non-equal time and space density correlation functions in many-body states and the transition probabilities of single-particle states at nonzero distances. Finally, we study the equal-time connected density-density correlation functions in many-body states, which exhibit certain qualitative analogies with the survival and transition probabilities of single-particle states. Our results are numerically tested for two paradigmatic models of single-particle localization: the 3D Anderson model and the 1D Aubry-André model. This work gives an affirmative answer to the question of whether it is possible to measure features of the single-particle survival and transition probabilities by the dynamics of observables in many-body states.
- [9] arXiv:2406.05865 (cross-list from quant-ph)
Title: Information scrambling in quantum-walks
Comments: 8 pages, 6 figures
Subjects: Quantum Physics (quant-ph); Disordered Systems and Neural Networks (cond-mat.dis-nn)
We study information scrambling (the spread of initially localized quantum information into the system's many degrees of freedom) in discrete-time quantum walks. We consider out-of-time-ordered correlators (OTOC) and K-complexity as probes of information scrambling. The OTOC for local spin operators in all directions has a light-cone structure which is "shell-like". As the wavefront passes, the OTOC approaches zero in the long-time limit, showing no signature of scrambling. The introduction of spatial or temporal disorder changes the shape of the light cone, akin to localization of the wave function. We formulate the K-complexity in systems with discrete-time evolution, and show that it grows linearly in discrete-time quantum walks. The presence of disorder modifies this growth to sub-linear. Our study presents an interesting case for exploring many-body phenomena in discrete-time quantum walks using scrambling.
- [10] arXiv:2406.06193 (cross-list from cond-mat.stat-mech)
Title: Rényi entanglement entropy of spin chain with Generative Neural Networks
Comments: 10 pages, 7 figures
Subjects: Statistical Mechanics (cond-mat.stat-mech); Disordered Systems and Neural Networks (cond-mat.dis-nn); High Energy Physics - Lattice (hep-lat); Quantum Physics (quant-ph)
We describe a method to estimate Rényi entanglement entropy of a spin system, which is based on the replica trick and generative neural networks with explicit probability estimation. It can be extended to any spin system or lattice field theory. We demonstrate our method on a one-dimensional quantum Ising spin chain. As the generative model, we use a hierarchy of autoregressive networks, allowing us to simulate up to 32 spins. We calculate the second Rényi entropy and its derivative and cross-check our results with the numerical evaluation of entropy and results available in the literature.
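As a reminder of the quantity being estimated, here is a minimal sketch of the Rényi entropy computed directly from the eigenvalues of a reduced density matrix, using the standard definition S_n = ln(sum_i p_i^n) / (1 - n). This is only the textbook definition, not the replica-trick and neural-network estimator the abstract describes. The function name and the example spectra are hypothetical.

```python
import math


def renyi_entropy(eigenvalues, n=2):
    """Renyi entropy of order n for a normalized spectrum p_i (n != 1)."""
    if n == 1:
        raise ValueError("n = 1 is the von Neumann limit and needs a separate formula")
    if abs(sum(eigenvalues) - 1.0) > 1e-9:
        raise ValueError("eigenvalues must sum to 1")
    return math.log(sum(p ** n for p in eigenvalues)) / (1 - n)


# Illustrative spectra of a one-qubit reduced density matrix.
product_state = renyi_entropy([1.0, 0.0])  # no entanglement
bell_state = renyi_entropy([0.5, 0.5])     # maximally entangled pair
```

For n = 2 the sum is the purity Tr(rho^2), so the second Rényi entropy vanishes for a product state and equals ln 2 for a maximally entangled qubit pair.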
- [11] arXiv:2406.06387 (cross-list from cond-mat.quant-gas)
Title: Time-tronics: from temporal printed circuit board to quantum computer
Comments: 10 pages (including Methods), 3 figures
Subjects: Quantum Gases (cond-mat.quant-gas); Disordered Systems and Neural Networks (cond-mat.dis-nn); Mesoscale and Nanoscale Physics (cond-mat.mes-hall); Soft Condensed Matter (cond-mat.soft); Quantum Physics (quant-ph)
Time crystalline structures can be created in periodically driven systems. They are temporal lattices which can reveal different condensed matter behaviours ranging from Anderson localization in time to temporal analogues of many-body localization or topological insulators. However, the potential practical applications of time crystalline structures have yet to be explored. Here, we pave the way for time-tronics where temporal lattices are like printed circuit boards for realization of a broad range of quantum devices. The elements of these devices can correspond to structures of dimensions higher than three and can be arbitrarily connected and reconfigured at any moment. Moreover, our approach allows for the construction of a quantum computer, enabling quantum gate operations for all possible pairs of qubits. Our findings indicate that the limitations faced in building devices using conventional spatial crystals can be overcome by adopting crystalline structures in time.
- [12] arXiv:2406.06482 (cross-list from quant-ph)
Title: Quantum Equilibrium Propagation for efficient training of quantum systems based on Onsager reciprocity
Comments: 10 pages, 3 figures; comments welcome!
Subjects: Quantum Physics (quant-ph); Disordered Systems and Neural Networks (cond-mat.dis-nn); Emerging Technologies (cs.ET); Machine Learning (cs.LG)
The widespread adoption of machine learning and artificial intelligence in all branches of science and technology has created a need for energy-efficient, alternative hardware platforms. While such neuromorphic approaches have been proposed and realised for a wide range of platforms, physically extracting the gradients required for training remains challenging as generic approaches only exist in certain cases. Equilibrium propagation (EP) is such a procedure that has been introduced and applied to classical energy-based models which relax to an equilibrium. Here, we show a direct connection between EP and Onsager reciprocity and exploit this to derive a quantum version of EP. This can be used to optimize loss functions that depend on the expectation values of observables of an arbitrary quantum system. Specifically, we illustrate this new concept with supervised and unsupervised learning examples in which the input or the solvable task is of quantum mechanical nature, e.g., the recognition of quantum many-body ground states, quantum phase exploration, sensing and phase boundary exploration. We propose that in the future quantum EP may be used to solve tasks such as quantum phase discovery with a quantum simulator even for Hamiltonians which are numerically hard to simulate or even partially unknown. Our scheme is relevant for a variety of quantum simulation platforms such as ion chains, superconducting qubit arrays, neutral atom Rydberg tweezer arrays and strongly interacting atoms in optical lattices.
Cross submissions for Tuesday, 11 June 2024 (showing 8 of 8 entries)
- [13] arXiv:2401.07538 (replaced)
Title: Evidence of Scaling Regimes in the Hopfield Dynamics of Whole Brain Model
Subjects: Disordered Systems and Neural Networks (cond-mat.dis-nn); Neural and Evolutionary Computing (cs.NE)
It is shown that a Hopfield recurrent neural network, informed by experimentally derived brain topology, recovers the scaling picture recently introduced by Deco et al., according to which the process of information transfer within the human brain shows spatially correlated patterns qualitatively similar to those displayed by turbulent flows, although with a more singular exponent, 1/2 instead of 2/3. Both models employ a coupling strength which decays exponentially with the Euclidean distance between the nodes, but their mathematical nature is very different, Hopf oscillators versus a Hopfield neural network, respectively. Hence, their convergence for the same data parameters suggests an intriguing robustness of the scaling picture. The present analysis shows that the Hopfield model brain remains functional when links longer than about five decay lengths, corresponding to about one sixth of the size of the global brain, are removed. This suggests that, in terms of connectivity decay length, the Hopfield brain functions in a sort of intermediate "turbulent liquid"-like state, whose essential connections are the intermediate ones between the connectivity decay length and the global brain size. Finally, the scaling exponents are shown to be highly sensitive to the value of the decay length, as well as to the number of brain parcels employed. As a result, any quantitative assessment regarding the specific nature of the scaling regime must be taken with great caution.
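The distance-dependent coupling and link pruning mentioned in the abstract can be sketched as follows, assuming the simplest form w_ij = exp(-d_ij / lambda) with d_ij the Euclidean distance between nodes i and j. The function name, the cutoff parameter, and the node coordinates are hypothetical. The paper's actual connectivity comes from experimentally derived brain topology, which is not reproduced here.

```python
import math


def coupling_matrix(positions, decay_length, cutoff_in_decay_lengths=None):
    """Exponentially decaying couplings, optionally pruned beyond a cutoff distance."""
    n = len(positions)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-coupling
            d = math.dist(positions[i], positions[j])
            if (cutoff_in_decay_lengths is not None
                    and d > cutoff_in_decay_lengths * decay_length):
                continue  # prune links longer than the cutoff
            weights[i][j] = math.exp(-d / decay_length)
    return weights


# Three illustrative nodes on a line; the third is far from the other two.
nodes = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
full = coupling_matrix(nodes, decay_length=1.0)
pruned = coupling_matrix(nodes, decay_length=1.0, cutoff_in_decay_lengths=5)
```

With a cutoff of five decay lengths, only the short link between the first two nodes survives. The pruned links carried weights of order exp(-9) or smaller, which illustrates why removing them changes the coupling matrix very little.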
- [14] arXiv:2405.15220 (replaced)
Title: Hybrid scaling properties of localization transition in a non-Hermitian disorder Aubry-André model
Subjects: Disordered Systems and Neural Networks (cond-mat.dis-nn)
In this paper, we study the critical behaviors in the non-Hermitian disorder Aubry-André (DAA) model, and we assume the non-Hermiticity is introduced by the nonreciprocal hopping. We employ the localization length $\xi$, the inverse participation ratio (IPR), and the real part of the energy gap between the first excited state and the ground state $\Delta E$ as the characteristic quantities to describe the critical properties of the localization transition. By performing the scaling analysis, the critical exponents of the non-Hermitian Anderson model and the non-Hermitian DAA model are obtained, and these critical exponents are different from their Hermitian counterparts, indicating that the Hermitian and non-Hermitian Anderson and DAA models belong to different universality classes. The critical exponents of the non-Hermitian DAA model are remarkably different from those of both the pure non-Hermitian AA model and the non-Hermitian Anderson model, showing that disorder is an independent relevant direction at the non-Hermitian AA model critical point. We further propose a hybrid scaling theory to describe the critical behavior in the overlapping critical region constituted by the critical regions of the non-Hermitian DAA model and the non-Hermitian Anderson localization.
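Of the quantities listed in the abstract, the inverse participation ratio is the easiest to illustrate. The sketch below assumes the common definition IPR = sum_i |psi_i|^4 / (sum_i |psi_i|^2)^2 applied to a single (right) eigenvector. Non-Hermitian studies sometimes use other conventions, and the abstract does not say which one the authors adopt. The function name and the example states are hypothetical.

```python
def inverse_participation_ratio(amplitudes):
    """IPR of a wave function given as a list of (possibly complex) amplitudes."""
    norm = sum(abs(a) ** 2 for a in amplitudes)
    if norm == 0:
        raise ValueError("wave function must be nonzero")
    return sum(abs(a) ** 4 for a in amplitudes) / norm ** 2


# A state localized on one site versus one spread evenly over four sites.
localized = inverse_participation_ratio([1.0, 0.0, 0.0, 0.0])
extended = inverse_participation_ratio([0.5, 0.5j, -0.5, -0.5j])
```

The IPR is 1 for a state confined to a single site and 1/N for a state spread uniformly over N sites, which is why its scaling with system size is a standard diagnostic of a localization transition.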
- [15] arXiv:2401.09524 (replaced)
Title: Size Winding Mechanism beyond Maximal Chaos
Comments: v2: 15 pages, 5 figures. v1: 6 pages, 4 figures, Supplemental Material: 4 pages, 1 figure
Subjects: Quantum Physics (quant-ph); Disordered Systems and Neural Networks (cond-mat.dis-nn); Quantum Gases (cond-mat.quant-gas); Strongly Correlated Electrons (cond-mat.str-el); High Energy Physics - Theory (hep-th)
The concept of information scrambling elucidates the dispersion of local information in quantum many-body systems, offering insights into various physical phenomena such as wormhole teleportation. This phenomenon has spurred extensive theoretical and experimental investigations. Among these, the size-winding mechanism emerges as a valuable diagnostic tool for optimizing signal detection. In this work, we establish a computational framework for determining the winding size distribution in large-$N$ quantum systems with all-to-all interactions, utilizing the scramblon effective theory. We obtain the winding size distribution for the large-$q$ SYK model across the entire time domain. Notably, we unveil that the manifestation of size winding results from a universal phase factor in the scramblon propagator, highlighting the significance of the Lyapunov exponent. These findings contribute to a sharp and precise connection between operator dynamics and the phenomenon of wormhole teleportation.
- [16] arXiv:2402.05674 (replaced)
Title: A High Dimensional Statistical Model for Adversarial Training: Geometry and Trade-Offs
Subjects: Machine Learning (stat.ML); Disordered Systems and Neural Networks (cond-mat.dis-nn); Machine Learning (cs.LG)
This work investigates adversarial training in the context of margin-based linear classifiers in the high-dimensional regime where the dimension $d$ and the number of data points $n$ diverge with a fixed ratio $\alpha = n / d$. We introduce a tractable mathematical model where the interplay between the data and adversarial attacker geometries can be studied, while capturing the core phenomenology observed in the adversarial robustness literature. Our main theoretical contribution is an exact asymptotic description of the sufficient statistics for the adversarial empirical risk minimiser, under generic convex and non-increasing losses. Our results allow us to precisely characterise which directions in the data are associated with a higher generalisation/robustness trade-off, as defined by a robustness and a usefulness metric. In particular, we unveil the existence of directions which can be defended without penalising accuracy. Finally, we show the advantage of defending non-robust features during training, identifying a uniform protection as an inherently effective defence mechanism.
- [17] arXiv:2402.07626 (replaced)
Title: Stochastic Gradient Flow Dynamics of Test Risk and its Exact Solution for Weak Features
Comments: Accepted to ICML 2024
Subjects: Machine Learning (stat.ML); Disordered Systems and Neural Networks (cond-mat.dis-nn); Machine Learning (cs.LG)
We investigate the test risk of continuous-time stochastic gradient flow dynamics in learning theory. Using a path integral formulation we provide, in the regime of a small learning rate, a general formula for computing the difference between test risk curves of pure gradient and stochastic gradient flows. We apply the general theory to a simple model of weak features, which displays the double descent phenomenon, and explicitly compute the corrections brought about by the added stochastic term in the dynamics, as a function of time and model parameters. The analytical results are compared to simulations of discrete-time stochastic gradient descent and show good agreement.
- [18] arXiv:2402.13999 (replaced)
Title: Asymptotics of Learning with Deep Structured (Random) Features
Comments: ICML camera-ready version
Subjects: Machine Learning (stat.ML); Disordered Systems and Neural Networks (cond-mat.dis-nn); Machine Learning (cs.LG); Statistics Theory (math.ST)
For a large class of feature maps we provide a tight asymptotic characterisation of the test error associated with learning the readout layer, in the high-dimensional limit where the input dimension, hidden layer widths, and number of training samples are proportionally large. This characterization is formulated in terms of the population covariance of the features. Our work is partially motivated by the problem of learning with Gaussian rainbow neural networks, namely deep non-linear fully-connected networks with random but structured weights, whose row-wise covariances are further allowed to depend on the weights of previous layers. For such networks we also derive a closed-form formula for the feature covariance in terms of the weight matrices. We further find that in some cases our results can capture feature maps learned by deep, finite-width neural networks trained under gradient descent.
- [19] arXiv:2404.17829 (replaced)
Title: Thermodynamic & pattern-matching efficiency in heterogeneous networks
Subjects: Statistical Mechanics (cond-mat.stat-mech); Disordered Systems and Neural Networks (cond-mat.dis-nn)
Why complex structures emerge in a real-world environment is a fascinating question which has not found any fully satisfactory answer to date. At the border between statistical physics, machine learning and network theory, we provide some computational empirical evidence indicating that the emergence of complex structures in the outer world is the macroscopic thermodynamic result of the optimal trade-off between the ability to encode information about the environment in the very features of the network and the ability to transmit information across it. While the former is quantified by the von Neumann network entropy and the latter by the Helmholtz free energy, the two are combined into a single quantity tantamount to the thermodynamic efficiency. We take as a case study an algorithm originally proposed for creating and searching trees of patterns under the assumption that, akin to a living system, the algorithm seeks the best internal representation of the outer world, namely a real-world dataset of patterns. Remarkably, the optimal thermodynamic efficiency overlaps with the optimal pattern-matching efficiency, defined in terms of accuracy and computational cost, when the former is measured at the resolution scale where the network transits from exhibiting a specific heat with a single peak to a scale-invariant region. Moreover, the critical hypothesis is investigated, and empirical evidence suggests that the system benefits from the vicinity to the critical point when it is faced with higher-complexity environments, while it does not when the environment is simpler.