MathWorks' MATLAB R2016a had a number of improvements, including the Live Editor and big data analytics. The latest iteration, MATLAB R2016b, incorporates a new data type: tall arrays (see figure). Tall arrays allow MATLAB users to access data from a number of sources, ranging from databases to a Hadoop Distributed File System (HDFS). They are not limited by the amount of memory on the host computer.
MATLAB R2016b tall arrays can provide data from sources that range from a database to a Hadoop Distributed File System (HDFS).
MATLAB R2006b introduced the concept of distributed arrays. This allowed developers to take advantage of a cluster by running a MATLAB application on multiple machines, with in-memory data distributed among them. This feature is still available, but the challenge is designing the application and distributing the data.
Tall arrays can be used in a variety of ways, from a single system running an application that has access to large amounts of data (more than will fit into host memory) to a Hadoop system with many processors (that have access to HDFS).
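As a rough illustration of how a tall array is created and evaluated, the sketch below builds one on top of a datastore and computes a single statistic. It assumes the airlinesmall.csv sample file that ships with MATLAB and its ArrDelay column; a real application would point the datastore at its own files, database, or HDFS location.

```matlab
% Point a datastore at the data (assumed here: the airlinesmall.csv sample file).
ds = datastore('airlinesmall.csv');
ds.TreatAsMissing = 'NA';                 % treat 'NA' entries as missing values
ds.SelectedVariableNames = {'ArrDelay'};  % read only the column we need

% Wrap the datastore in a tall table; no data is loaded into memory yet.
tt = tall(ds);

% Operations on tall arrays are deferred rather than run immediately.
avgDelay = mean(tt.ArrDelay, 'omitnan');

% gather triggers the evaluation, reading the data in chunks,
% and returns the result as an ordinary in-memory value.
avgDelay = gather(avgDelay);
```

Because evaluation is deferred until gather is called, the same code should work whether the data fits in host memory or not; only the datastore location would need to change.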