Sometimes you can reorient your problem so that a matrix or tensor has nonzero entries only on its diagonal, which makes it easier to handle. You do this, for example, by choosing new unit vectors or by changing the set of functions describing the problem (whatever the matrix or tensor is tied to).
in practice, a lot of problems/equations can be approximated with so-called “linear equations”. a linear equation can involve any number of variables, but every variable appears only to the first power, and variables are never multiplied together. for example, 2x + 6y + 5z = 8 is a linear equation. (this is a generalization of the case where there are two variables and you have an equation like 2x + 3y = 9. in that case, you can rearrange the equation into slope-intercept form and get the standard equation of a line, hence the name linear.)
again, in practice, it’s very common to deal with multiple linear equations at the same time. say, 2x + 3y = 9 and x + y = 0. to solve all such equations simultaneously means finding values of x and y that satisfy both equations (here, x = -9 and y = 9). in the 2 variable case, that amounts to finding the point where two lines intersect (if there is one).
you can rewrite any system of linear equations as a single matrix equation: the coefficients go into a matrix, and the unknowns and the right-hand sides go into vectors. this is “nice” because matrices are extremely easy for computers to deal with, and we also have a lot of theorems that talk about how matrices behave.
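to make that concrete, here’s a rough sketch of the system from above (2x + 3y = 9 and x + y = 0) written as a matrix and a vector, then solved with cramer’s rule. `solve_2x2` is just a name i made up for this illustration, and cramer’s rule is only one of several ways to do it; real software normally uses a linear algebra library instead.

```python
from fractions import Fraction

def solve_2x2(A, b):
    """Solve A @ [x, y] = b for a 2x2 matrix A using Cramer's rule."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    if det == 0:
        # the two lines are parallel or identical, so no single intersection
        raise ValueError("no unique solution")
    x = Fraction(b[0] * a22 - a12 * b[1], det)
    y = Fraction(a11 * b[1] - b[0] * a21, det)
    return x, y

# 2x + 3y = 9
#  x +  y = 0
A = [[2, 3], [1, 1]]   # coefficients, one row per equation
b = [9, 0]             # right-hand sides

x, y = solve_2x2(A, b)
print(x, y)  # -9 9
```

the point (-9, 9) is where the two lines cross.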
so, to summarize, we’ve reduced a real world problem into something involving matrices, with the hope of maybe having a computer solve it. in practice, many matrices can be “diagonalized”, which basically means the matrix can be factored as a product of three matrices with a diagonal matrix in the middle (A = P D P^-1, where D is diagonal and P is invertible). i’m glossing over the details of when this is possible because it can be messy if you don’t know much linear algebra. you can think of it as kind of like factorizing a number into primes. it isn’t really the same thing, but it can be a helpful analogy. (primes are easier to work with, and sometimes it’s helpful to view any number as just a bunch of primes multiplied together.)
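if you want to see a diagonalization with actual numbers, here’s a small sketch. the matrix A is one i picked for illustration because its factors work out cleanly by hand, and `matmul` is a throwaway helper, not a library function. the code only checks that the three factors multiply back to A; it doesn’t show how to find them.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

half = Fraction(1, 2)

A     = [[2, 1], [1, 2]]                  # the matrix we want to factor
P     = [[1, 1], [1, -1]]                 # columns are eigenvectors of A
D     = [[3, 0], [0, 1]]                  # diagonal: the matching eigenvalues
P_inv = [[half, half], [half, -half]]     # inverse of P

# P_inv really is the inverse of P
assert matmul(P, P_inv) == [[1, 0], [0, 1]]

# and the three factors multiply back to A, i.e. A = P D P^-1
assert matmul(matmul(P, D), P_inv) == A
```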
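the main advantage of diagonal matrices is that they’re very easy to work with (compared to general matrices, which are already “nice” to work with). for example, multiplying two diagonal matrices, or raising one to a power, just means doing that operation on each diagonal entry separately. in practice, this is important because a lot of formulas, algorithms, etc only work for (or are most efficient on) diagonal matrices.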
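one place this pays off is raising a matrix to a power. if A = P D P^-1, then A^n = P D^n P^-1, and D^n only needs each diagonal entry raised to the n-th power. here’s a sketch under that assumption, using an example matrix i chose because its factors are easy to write down; `matmul`, `diag_power`, `power_direct` and `power_via_diag` are all made-up helper names.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def diag_power(D, n):
    """Raise a diagonal matrix to the n-th power, entry by entry."""
    return [[D[i][j] ** n if i == j else 0 for j in range(len(D))]
            for i in range(len(D))]

def power_direct(M, n):
    """Slow way: multiply M by itself n - 1 times (n >= 1)."""
    result = M
    for _ in range(n - 1):
        result = matmul(result, M)
    return result

half = Fraction(1, 2)
A     = [[2, 1], [1, 2]]
P     = [[1, 1], [1, -1]]                 # columns are eigenvectors of A
D     = [[3, 0], [0, 1]]                  # matching eigenvalues
P_inv = [[half, half], [half, -half]]     # inverse of P

def power_via_diag(n):
    """Fast way: A^n = P D^n P^-1, where D^n is cheap to compute."""
    return matmul(matmul(P, diag_power(D, n)), P_inv)

# both routes give the same matrix
assert power_via_diag(10) == power_direct(A, 10)
```

the direct route needs nine matrix multiplications here, while the diagonal route needs two no matter how big n gets.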
i hope this helps. it’s a bit hard to get into the details without making things needlessly complicated (a common problem for things involving matrices), but i tried my best to focus on the underlying concepts/real world use cases.