Developing Linear Algebra Codes on Modern Processors