Ever so slightly closes price gap between high capacity SSDs and HDDs
The 15.36TB drive, which is a smidgen smaller in capacity than the largest hard disk drive currently on the market (a 16TB Toshiba HDD model), costs “only” €2,474.78 plus sales tax or around $2,770 (about £2,140).
While that is far more expensive than smaller capacity SSDs (Silicon Power’s 1TB SSD retails for under $95 at Amazon), it is less than half the average price of competing enterprise SSDs like the Seagate Nytro 3330, the Western Digital Ultrastar DC SS530, the Toshiba PM5-R or the Samsung SSD PM1633a.
HDD still wins the price/capacity comparison
And just for comparison, a 14TB hard disk drive, the MG07 from Toshiba, retails for around $440, about a sixth of the price, which gives you an idea of the price gulf between the two. If you are looking for something bigger, then the Samsung SSD PM1643 is probably your only bet at €7,294.22 excluding VAT.
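To put the price gulf in per-terabyte terms, here is a minimal sketch using only the dollar prices and capacities quoted above. The dictionary labels and the helper name `price_per_tb` are illustrative, and the figures are street prices that will drift.

```python
# Prices (USD) and capacities (TB) as quoted in the article.
drives = {
    "15.36TB SSD": (2770.0, 15.36),
    "14TB Toshiba MG07 HDD": (440.0, 14.0),
}


def price_per_tb(price_usd: float, capacity_tb: float) -> float:
    """Return the cost of one terabyte in US dollars."""
    return price_usd / capacity_tb


per_tb = {name: price_per_tb(price, cap) for name, (price, cap) in drives.items()}

for name, value in per_tb.items():
    print(f"{name}: ${value:,.2f} per TB")

# Ratio of the sticker prices, the comparison the article makes.
sticker_ratio = drives["15.36TB SSD"][0] / drives["14TB Toshiba MG07 HDD"][0]
print(f"The SSD costs about {sticker_ratio:.1f} times as much as the HDD")
```

On these numbers the SSD works out at roughly $180 per terabyte against a little over $31 for the hard disk.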
Bear in mind that these are 2.5-inch models which are far smaller than 3.5-inch hard disk drives. They also connect to the host computer using a special connector called SAS (Serial Attached Small Computer System Interface). The Micron 9300 Pro connects via the U.2 PCIe (NVMe), offering read speeds of up to 3.5GBps.
For the ultimate data hoarder, there’s the Nimbus Data ExaDrive which boasts a capacity of 100TB albeit in a 3.5-inch form factor.
Scientists find surprising way to affect information storage properties in metal alloy.
Sometimes scientific discoveries can be found along well-trodden paths. That proved the case for a cobalt-iron alloy material commonly found in hard disk drives.
As reported in a recent issue of Physical Review Letters, researchers from the U.S. Department of Energy’s (DOE) Argonne National Laboratory, along with Oakland University in Michigan and Fudan University in China, have found a surprising quantum effect in this alloy.
The effect involves the ability to control the direction of electron spin, and it could allow scientists to develop more powerful and energy-efficient materials for information storage. By changing the electron spin direction in a material, the researchers were able to alter its magnetic state. This greater control of magnetization allows more information to be stored and retrieved in a smaller space. Greater control could also yield additional applications, such as more energy-efficient electric motors, generators and magnetic bearings.
The effect the researchers discovered has to do with “damping,” in which the direction of electron spin controls how the material dissipates energy. “When you drive your car down a flat highway with no wind, the dissipating energy from drag is the same regardless of the direction you travel,” said Argonne materials scientist Olle Heinonen, an author of the study. “With the effect we discovered, it’s like your car experiences more drag if you’re traveling north-south than if you’re traveling east-west.”
“In technical terms, we discovered a sizable effect from magnetic damping in nanoscale layers of cobalt-iron alloy coated on one side of a magnesium oxide substrate,” added Argonne materials scientist Axel Hoffmann, another author of the study. “By controlling the electron spin, magnetic damping dictates the rate of energy dissipation, controlling aspects of the magnetization.”
The team’s discovery proved especially surprising because the cobalt-iron alloy had been widely used in applications such as magnetic hard drives for many decades, and its properties have been thoroughly investigated. It was conventional wisdom that this material did not have a preferred direction for electron spin and thus magnetization.
In the past, however, scientists prepared the alloy for use by “baking” it at high temperature, which orders the arrangement of the cobalt and iron atoms in a regular lattice, eliminating the directional effect. The team observed the effect by examining unbaked cobalt-iron alloys, in which cobalt and iron atoms can randomly occupy each other’s sites.
The team was also able to explain the underlying physics. In a crystal structure, atoms normally sit at perfectly regular intervals in a symmetric arrangement. In the crystal structure of certain alloys, there are slight differences in the separation between atoms that can be removed through the baking process; these differences remain in an “unbaked” material.
Squeezing such a material at the atomic level further changes the separation of the atoms, resulting in different interactions between atomic spins in the crystalline environment. This difference explains how the damping effect on magnetization is large in some directions, and small in others.
The result is that very small distortions in the atomic arrangement within the crystalline structure of cobalt-iron alloy have giant implications for the damping effect. The team ran calculations at the Argonne Leadership Computing Facility, a DOE Office of Science User Facility, that confirmed their experimental observations.
The researchers’ work appears in the March 21 online edition of Physical Review Letters and is entitled “Giant anisotropy of Gilbert damping in epitaxial CoFe films.”
If you ask Adrian Macneil, not especially. And he should know — he’s the director of engineering at Cruise, the self-driving startup that General Motors acquired for nearly $1 billion in 2016. “I think the best way that I’ve heard this described [is], the entire industry is basically in a race to the starting line,” Macneil told VentureBeat in a phone interview. “The penetration of driving the majority of miles with autonomous miles isn’t going to happen overnight.”
Cruise is considered a pack leader in a global market that’s anticipated to hit revenue of $173.15 billion by 2023. Although it hasn’t yet launched a driverless taxi service (unlike competitors Waymo, Yandex, and Drive.ai) or sold cars to customers, it’s driven more miles than most — around 450,000 in California last year, according to a report it filed with the state’s Department of Motor Vehicles. That’s behind only Waymo, which drove 1.2 million miles. Moreover, it’s repeatedly promised to launch a commercial service this year that would feature as many as 2,600 driverless cars without steering wheels, brake pedals, and accelerators.
But it’s been a long and winding path for Cruise since its humble beginnings five years ago, to put it mildly. To get a sense of how far Cruise has come and where it’s going, we spoke with Macneil about Cruise’s ongoing efforts to train cars synthetically, why the company is targeting San Francisco as one of several potential launch cities, and how Cruise fits into the broader self-driving landscape.
Cruise Automation chief technology officer Kyle Vogt, who held the role of CEO until January, when former GM president Dan Ammann took over, cofounded Cruise with Dan Kan in 2013. Vogt, an MIT computer science graduate and a founding employee of Justin.tv (which became Twitch), started a number of companies prior to Cruise, including Socialcam, a mobile social video app that was acquired by Autodesk for $60 million in 2012. (Amazon purchased Twitch in 2014 for $970 million.)
Vogt can trace his passion for robotics back to childhood. By age 14, he built a Power Wheels car that could drive using computer vision. And while an undergraduate at MIT, he competed with a team in the 2004 Defense Advanced Research Projects Agency (DARPA) Grand Challenge, a $1 million competition to develop a car that could autonomously navigate a route from Barstow, California to Primm, Nevada.
Above: GM: Fourth generation vehicle, the Cruise AV.
Roughly a year after Cruise joined Y Combinator, Vogt teamed up with Dan Kan — the younger brother of Justin.tv’s Justin Kan — and it wasn’t long before they and a small team of engineers had a prototype: the RP-1. The $10,000 direct-to-consumer aftermarket kit retrofitted the Audi A4 and S4 with highway self-driving features (much like the open source stack developed by George Hotz’s Comma.ai), with the goal of supporting additional vehicles down the line.
But at a certain point, they decided to pivot toward building a more ambitious platform that could conquer city driving. Cruise announced in January 2014 that it would abandon the RP-1 in favor of a system built on top of the Nissan Leaf, and in June 2015, it received a permit to test its tech from the California Department of Motor Vehicles.
GM acquired Cruise shortly afterward, in March 2016. Back then, Cruise had roughly 40 employees, a number that quickly ballooned to 100. Cruise had 200 as of June 2017, and it plans to hire over 2,000 new workers — double its current workforce — by 2021.
Growth hasn’t slowed in the intervening months. In May 2018, Cruise — which remains an independent division within GM — announced that SoftBank’s Vision Fund would invest $2.25 billion in the company, along with another $1.1 billion from GM itself. And in October 2018, Honda pledged $750 million, to be followed by another $2 billion in the next 12 years. Today, Cruise has an estimated valuation of $14.6 billion, and the company recently expanded to a larger office in San Francisco and committed to opening an engineering hub in Seattle.
Along the way, Cruise acquired Zippy.ai, a startup developing autonomous robots for last-mile grocery and package delivery, and more recently snatched up Strobe, a provider of “chip-scale” lidar technology. Cruise says that the latter’s hardware will enable it to reduce the cost of each sensor on its self-driving cars by 99%.
Cruise runs lots of simulations across its suite of internal tools — about 200,000 hours of compute jobs each day in Google Cloud Platform (25 times the number of hours 12 months ago) — one of which is an end-to-end, three-dimensional Unreal Engine environment that Cruise employees call “The Matrix.” Macneil says it enables engineers to build any kind of situation they’re able to dream up, and to synthesize sensor inputs like camera footage and radar feeds to autonomous virtual cars.
According to Macneil, Cruise spins up 30,000 instances daily across over 300,000 processor cores and 5,000 graphics cards, each of which loops through a single drive’s worth of scenarios and generates 300 terabytes of results. It’s basically like having 30,000 virtual cars driving around in parallel, he explained, and it’s a bit like Waymo’s Carcraft and the browser-based framework used by Uber’s Advanced Technology Group.
Above: A screenshot taken from within “The Matrix,” Cruise’s end-to-end simulation tool. Image Credit: GM
“[The Matrix] is really good for understanding how the entire car behaves [and] also how it behaves in situations that we would not encounter frequently in the real world,” said Macneil. “So if we want to find out what happens, say, if a small object jumps in front of a car or something, we can create those kinds of simulations and reliably reproduce them. If every time you have a software release you deploy to the car and then go out and drive 100,000 or 1,000,000 miles, you’re going to be waiting quite a long time for feedback.”
Another testing approach Cruise employs is replay, which involves extracting real-world sensor data, playing it back against the car’s software, and comparing the performance with human-labeled ground truth data. Yet another is a planning simulation, which lets Cruise create up to hundreds of thousands of variations of a scenario by tweaking variables like the speed of oncoming cars and the space between them.
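The planning-simulation idea, generating many variations of one scenario by sweeping its variables, can be sketched as a simple parameter grid. This is not Cruise's tooling: the function name, the dictionary keys, and the speed and gap values are all hypothetical and chosen purely for illustration.

```python
from itertools import product


def generate_variations(speeds_mph, gaps_m):
    """Yield one scenario per combination of oncoming speed and gap size."""
    for speed, gap in product(speeds_mph, gaps_m):
        yield {"oncoming_speed_mph": speed, "gap_m": gap}


# Hypothetical sweep values; a real system would sweep far more variables.
speeds = range(15, 50, 5)   # 15, 20, ..., 45 mph
gaps = range(10, 60, 10)    # 10, 20, ..., 50 metres

scenarios = list(generate_variations(speeds, gaps))
print(f"{len(scenarios)} variations of one unprotected-left-turn scenario")
```

Because the count is the product of the number of values per variable, adding a few more variables quickly reaches the hundreds of thousands of variations described.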
“We understand how, for example, if we take an updated version of the codebase and play back a construction zone, we can actually compare the results … We can really drill down to a really deep level and understand what our car’s behavior will be,” Macneil said. “If we take something like an unprotected left turn, which is a pretty complicated situation … we can [see how changes] affect how quickly our cars are able to identify [gaps between cars] and whether they choose to take that gap or not.”
Cruise doesn’t measure the number of simulated miles it’s driven, and that’s a conscious decision — Macneil says they prefer to place emphasis on the “quality” of miles rather than the total. “We think more about how the tests that are running hundreds of times a day [are covering a] range of scenarios,” he said. “It’s about more than just racking up a lot of miles — it’s about the exposure to different environments that you’re getting from those miles.”
But while its training data remains closely guarded, some of Cruise’s libraries and tools have begun to trickle into open source. In February, it released Worldview, a graphics stack of 2D and 3D scenes with accompanying mouse and movement controls, click interaction, and a suite of built-in commands. In the coming weeks, it will publish a full-featured visualization tool that’ll allow developers to drill into real-world and simulation data to better understand how autonomous systems — whether cars or robots — respond in certain situations.
In the real world, Cruise uses third-generation Chevrolet Bolt all-electric cars equipped with lidar sensors from Velodyne, as well as short- and long-range radar sensors, articulating radars, video cameras, fault-tolerant electrical and actuation systems, and computers running proprietary control algorithms engineered by Cruise. They also sport in-vehicle displays that show information about upcoming turns, merges, traffic light status, and other information, as well as brief explanations of pauses. Most are assembled in a billion-dollar Lake Orion, Michigan plant (in which GM further invested $300 million last month) that’s staffed by 1,000 people and hundreds of robots.
Cruise is testing them in Scottsdale, Arizona and the metropolitan Detroit area, with the bulk of deployment concentrated in San Francisco. It’s scaled up rapidly, growing its starting fleet of 30 driverless vehicles to about 130 by June 2017. Cruise hasn’t disclosed the exact total publicly, but the company has 180 self-driving cars registered with California’s DMV, and three years ago, documents obtained by IEEE Spectrum suggested the company planned to deploy as many as 300 test cars around the country.
Above: A replay test performed with GM’s Worldview platform. Image Credit: GM
Currently, Cruise operates an employees-only ride-hailing program in San Francisco called Cruise Anywhere that allows the lucky few who make it beyond the waitlist to use an app to get around all mapped areas of the city where its fleet operates. The Wall Street Journal reported that Cruise and GM hope to put self-driving taxis into usage tests with ride-sharing company Lyft, with the eventual goal of creating an on-demand network of driverless cars.
Building on the progress it’s made so far, Cruise announced a partnership with DoorDash earlier this year to pilot food and grocery delivery for select customers in San Francisco. It’s also making progress toward its fourth-generation car, which lacks a steering wheel and features automatic doors, rear seat airbags, and other redundant systems.
Testing and safety
Why the focus on San Francisco? Cruise argues that in densely populated cities, difficult maneuvers (like crossing into multiple lanes of oncoming traffic) happen quite often. Moreover, it points out that San Francisco offers more people, cars, and cyclists to contend with — about 17,246 people per square mile, or five times greater density than in Phoenix.
“Testing in the hardest places first means we’ll get to scale faster than starting with the easier ones,” Vogt explained in a blog post. “Based on our experience, every minute of testing in San Francisco is about as valuable as an hour of testing in the suburbs.”
For instance, Cruise’s Bolts encounter emergency vehicles almost 47 times as frequently in San Francisco as in more suburban environments like Scottsdale and Phoenix, road construction 39 times as often, cyclists 16 times as often, and pedestrians 32 times as often. They’ve navigated in and around six-way intersections with flashing red lights in all directions and people moving pallets through the streets of Chinatown, not to mention bicyclists who cut into traffic without the right of way and construction zones delineated by cones or flares.
Above: The mobile app for Cruise Anywhere. Image Credit: GM
“Just driving along in a stretch of road, whether it’s in the real world or in simulation, is not going to give you a huge amount of data,” said Macneil. “One of the reasons why we exist in San Francisco is because we encounter pedestrians, cyclists, construction zones, emergency medical, and all of these things just way more [often] … It’s critically important that we’re testing our cars and combining our real-world driving with our simulations, and with both of those looking to get a lot of coverage of what type of situations they’re encountering.”
The data seems to bear out that assertion. Last year, Cruise logged 5,205 miles between disengagements (instances when a safety driver intervened) in California, a substantial improvement over 2017’s 1,254 miles per disengagement.
Here’s how its average of 0.19 disengagements per 1,000 miles compared with others:
Waymo: 0.09 disengagements per 1,000 miles
Zoox: 0.50 disengagements per 1,000 miles
Nuro: 0.97 disengagements per 1,000 miles
Pony.ai: 0.98 disengagements per 1,000 miles
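The two ways of quoting the figure, miles between disengagements and disengagements per 1,000 miles, are reciprocals of each other. A short sketch of the conversion, using only the numbers quoted above (the function and variable names are illustrative):

```python
def disengagements_per_1000_miles(miles_per_disengagement: float) -> float:
    """Convert miles driven per disengagement into a rate per 1,000 miles."""
    return 1000.0 / miles_per_disengagement


cruise_2018 = disengagements_per_1000_miles(5205)
cruise_2017 = disengagements_per_1000_miles(1254)

# Rates per 1,000 miles as quoted in the article, plus Cruise's computed rate.
rates = {
    "Waymo": 0.09,
    "Cruise": round(cruise_2018, 2),
    "Zoox": 0.50,
    "Nuro": 0.97,
    "Pony.ai": 0.98,
}

# Lower is better, so sort ascending.
ranking = sorted(rates, key=rates.get)

print(f"Cruise 2018: {cruise_2018:.2f} per 1,000 miles")
print(f"Cruise 2017: {cruise_2017:.2f} per 1,000 miles")
print("Ranking:", ", ".join(ranking))
```

The 2018 figure of 5,205 miles reproduces the 0.19 rate quoted above, while the 2017 figure of 1,254 miles corresponds to about 0.80.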
Assuming Cruise’s tech works as promised, it could be a godsend for the millions of people who risk their lives every time they step into a car. About 94% of car crashes are caused by human error, and in 2016, the top three causes of traffic fatalities were distracted driving, drunk driving, and speeding.
But will it be enough to convince a skeptical public?
Three separate studies last summer — by the Brookings Institution, think tank HNTB, and the Advocates for Highway and Auto Safety (AHAS) — found that a majority of people aren’t convinced of driverless cars’ safety. More than 60% said they were “not inclined” to ride in self-driving cars, almost 70% expressed “concerns” about sharing the road with them, and 59% expected that self-driving cars will be “no safer” than human-controlled cars.
They have their reasons. In March 2018, Uber suspended testing of its autonomous Volvo XC90 fleet after one of its cars struck and killed a pedestrian in Tempe, Arizona. Separately, Tesla’s Autopilot driver-assistance system has been blamed for a number of fender benders, including one in which a Tesla Model S collided with a parked Culver City fire truck. (Tesla temporarily stopped offering “full self-driving capability” on select new models in early October 2018.)
The Rand Corporation estimates that autonomous cars will have to rack up 11 billion miles before we’ll have reliable statistics on their safety — far more than the roughly 2 million miles the dozens of companies testing self-driving cars in California logged last year. For his part, Macneil believes we’re years away from fully autonomous cars that can drive in most cities without human intervention, and he says that even when the industry does reach that point, it’ll be the first of many iterations to come.
“When you put the rates of improvement at the macro scale and you look at the entire industry, once we get the full self-driving cars on the road that have no safety driver in them and serving passengers, that’s just the first version, right?” he said. “There’s still an endless array of different weather conditions to handle, and different speeds, different situations, long-distance driving, and driving in snow and rain.”
Competition and unexpected detours
For all of its successes so far, Cruise has had its fair share of setbacks.
It backtracked on plans to test a fleet of cars in a five-mile square section in Manhattan, and despite public assurances that its commercial driverless taxi service remains on track, it’s declined to provide timelines and launch sites.
In more disappointing news for Cruise, the firm drove fewer than 450,000 collective miles all of last year in California, falling far short of its projected one million miles a month. (Cruise claims that the initial target was based on “expanding [its] resources equally across all of [its] testing locations,” and says that it’s instead chosen to prioritize its resources in complex urban environments.) For the sake of comparison, Alphabet’s Waymo, which was founded about four years before Cruise, has logged more than 10 million autonomous miles to date.
In a report last year citing sources “with direct knowledge of Cruise’s technology,” The Information alleged that Cruise’s San Francisco vehicles are still repeatedly involved in accidents or near-accidents and that it’s likely a decade before they come into wide use in major cities. Anecdotally, one VentureBeat reporter experienced a close call while crossing the road in front of a Cruise test vehicle in San Francisco.
Then, there’s the competition to consider.
Cruise faces the likes of Ike and Ford, the latter of which is collaborating with Postmates to deliver items from Walmart stores in Miami-Dade County, Florida. There’s also TuSimple, a three-year-old autonomous truck company with autonomous vehicles operating in Arizona, California, and China, as well as venture-backed Swedish driverless car company Einride. Meanwhile, Paz Eshel and former Uber and Otto engineer Don Burnette recently secured $40 million for startup Kodiak Robotics. That’s not to mention Embark, which integrates its self-driving systems into Peterbilt semis (and which launched a pilot with Amazon to haul cargo), as well as Tesla, Aptiv, May Mobility, Pronto.ai, Aurora, NuTonomy, Optimus Ride, Daimler, and Baidu, to name a few others.
Vogt believes that Cruise’s advantage lies in its distributed real-world and simulated training process, which he claims will enable it to launch in multiple cities simultaneously. In a GM investor meeting last year, Vogt conceded that the cars might not match human drivers in terms of capability — at least not at first. But he said that they should quickly catch up and then surpass them.
“Building a new vehicle that has an incredible user experience, optimal operational parameters, and efficient use of space is the ultimate engineering challenge,” he wrote in a recent Medium post. “We’re always looking for ways to accelerate the deployment of self-driving technology, since it’s inherently good in many different ways … We’re going to do this right.”
A look at how much bigger supercomputers can get, and the benefits on offer for healthcare and the environment
12th November 2018 was a big day in the upper echelons of computing. On this date, during the SC18 supercomputing conference in Dallas, Texas, the much-anticipated Top500 list was released.
Published twice a year since 1993, this is the list of the world’s 500 fastest computers, and a place at the top of the list remains a matter of considerable national pride. On this occasion, the USA retained the top position and also gained second place, following several years in which China occupied one or both of the top two places.
Commonly referred to as supercomputers, although the term high-performance computers (HPC) is often used by those in the know, these monsters bear little resemblance to the PCs that sit on our desks.
The world’s fastest computer is called Summit and is located at the Oak Ridge National Laboratory in Tennessee. It has 2,397,824 processor cores, provided by 22-core IBM Power9 processors clocked at 3.07GHz and Nvidia Volta Tensor Core GPUs, 10 petabytes (PB) of memory (10 million GB) and 250PB of storage.
Summit – today’s fastest supercomputer at the Oak Ridge National Laboratory in Tennessee – occupies a floor area equivalent to two tennis courts
While these sorts of figures include the statistics we expect to see when describing a computer, it’s quite an eye-opener when we take a look at some of the less familiar figures. All this hardware is connected together by 300km of fibre-optic cable. Summit is housed in 318 large cabinets, which together weigh 310 tonnes, more than the take-off weight of many large jet airliners. It occupies a floor area of 520 square metres, the equivalent of two tennis courts. All of this consumes 13MW of electrical power, which is roughly the same as 32,000 homes in the UK. The heat generated by all this energy has to be removed by a cooling system that pumps 18,000 litres of water through Summit every minute.
Oak Ridge is keeping tight-lipped about how much all this costs; estimates vary from $200m to $600m.
So what does all this hardware, and all this money, buy you in terms of raw computing power? The headline peak performance figure quoted in the Top500 list is 201 petaflops (quadrillion floating point operations per second).
Each of Summit’s 4,608 nodes contains six GPUs and two 22-core IBM Power9 CPUs, plus some serious plumbing to keep the hardware from frying
Given that Summit contains over two million processor cores, it will come as no surprise that it’s several thousand times faster than the PCs we use on a day-to-day basis. Even so, a slightly more informed figure may be something of a revelation. An ordinary PC today will probably clock up 50-100 gigaflops, so the world’s fastest computer will, in all likelihood, be a couple of million times faster than your pride and joy, perhaps more.
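The "couple of million times" estimate follows directly from the two figures quoted: 201 petaflops for Summit and 50-100 gigaflops for an ordinary PC. A quick check of the arithmetic, with illustrative variable names (the PC range is the article's rough estimate, not a measurement):

```python
PETA = 10**15
GIGA = 10**9

summit_flops = 201 * PETA   # peak figure quoted from the Top500 list
pc_fast_flops = 100 * GIGA  # upper end of the article's PC estimate
pc_slow_flops = 50 * GIGA   # lower end of the article's PC estimate

# Integer division is exact here because both PC figures divide evenly.
ratio_vs_fast_pc = summit_flops // pc_fast_flops
ratio_vs_slow_pc = summit_flops // pc_slow_flops

print(f"Versus a 100-gigaflops PC: {ratio_vs_fast_pc:,} times faster")
print(f"Versus a 50-gigaflops PC:  {ratio_vs_slow_pc:,} times faster")
```

So the gap is about two million times against a faster PC and about four million times against a slower one.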
It doesn’t stop there, either. While it didn’t meet the stringent requirements of the Top500 list, it appears Summit has gone beyond its official figure by carrying out some calculations in excess of one exaflops using reduced precision calculations. Oak Ridge says that this ground-breaking machine takes the USA a step closer to its goal of having an exascale computing capability – that is, performance in excess of one exaflops – by 2022.
The identity of the first supercomputer is a matter of some debate. It appears that the first machine that was referred to as such was the CDC 6600, which was introduced in 1964. The ‘super’ tag referred to the fact that it was 10 times faster than anything else that was available at the time.
The question of what processor it used was meaningless. Like all computers of the day, the CDC 6600 was defined by its CPU, and the 6600’s CPU was designed by CDC and built from discrete components. The computer cabinet was a novel X-shape when viewed from above, with this geometry minimising the lengths of the signal paths between components to maximise speed. Even in these early days, however, an element of parallelism was employed. The main CPU was designed to do only basic arithmetic and logic instructions, thereby limiting its complexity and increasing its speed. Because of the CPU’s reduced capabilities, however, it was supported by 10 peripheral processors that handled memory access and input/output.
Following the success of the CDC 6600 and its successor the CDC 7600, designer Seymour Cray left CDC to set up his own company, Cray Research. This name was synonymous with supercomputing for over two decades. Its first success was the Cray-1.
Designed, like the CDC 6600, to minimise wiring lengths, the Cray-1 had an iconic C-shaped cylindrical design with seats around its outside, covering the cooling system. Its major claim to fame was that it introduced the concept of the vector processor, a technology that dominated supercomputing for a substantial period, and still does today in the sense that ordinary CPUs and GPUs now include an element of vector processing.
The Cray-1 supercomputer broke new ground when it launched in 1975, thanks to its vector processor and iconic design
In this approach to parallelism, instructions are provided that work on several values in consecutive memory locations at once. Compared to the more conventional approach, this offers a significant speed boost by reducing the number of memory accesses, something that would otherwise represent a bottleneck, and increasing the number of results issued per clock cycle. Some vector processors also had multiple execution units and, in this way, were also able to offer performance gains in the actual computation.
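The idea can be illustrated with a toy counting model that compares how many instructions are issued when adding two arrays one element at a time and when adding them in chunks of consecutive values. This is only a sketch: Python does not execute real vector instructions, the function names are illustrative, and the chunk length of 64 is an assumption chosen for the example.

```python
VECTOR_LENGTH = 64  # assumed number of values handled per vector instruction


def scalar_add(a, b):
    """Add element by element, counting one instruction per element."""
    result, instructions = [], 0
    for x, y in zip(a, b):
        result.append(x + y)
        instructions += 1
    return result, instructions


def vector_add(a, b, vector_length=VECTOR_LENGTH):
    """Add in chunks of consecutive values, counting one instruction per chunk."""
    result, instructions = [], 0
    for start in range(0, len(a), vector_length):
        chunk_a = a[start:start + vector_length]
        chunk_b = b[start:start + vector_length]
        result.extend(x + y for x, y in zip(chunk_a, chunk_b))
        instructions += 1
    return result, instructions


a = list(range(1000))
b = list(range(1000, 2000))

scalar_result, scalar_count = scalar_add(a, b)
vector_result, vector_count = vector_add(a, b)

print(f"Scalar: {scalar_count} instructions issued")
print(f"Vector: {vector_count} instructions issued")
```

Both routines produce the same sums; the difference is that the vector version issues one instruction per block of consecutive values rather than one per value, which is where the saving in instruction and memory traffic comes from.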
The next important milestone in HPC came with the Cray X-MP in 1984. Like the Cray-1, it had a vector architecture, but where it broke new ground was in the provision of not one, but two vector processors. Its successor, the Cray Y-MP, built on this foundation by offering two, four or eight processors. And where Cray led, others followed, with companies eager to claim the highest processor count. By 1995, for example, Fujitsu’s VPP300 had a massive 256 vector processors.
Meanwhile, Thinking Machines had launched its own CM-1 a few years earlier. Its processor was an in-house design that was much less powerful than the contemporary vector designs, but the machine was able to include many more of them – up to 65,536.
The CM-1 wasn’t able to compete with parallel vector processor machines for pure speed, but it was ahead of its time in the sense that it heralded the next major innovation in supercomputing – the use of vast numbers of off-the-shelf processors. Initially, these were high-performance microprocessor families.
Launched in 1995, the Cray T3E was a notable machine of this type, using DEC’s Alpha chips. And while supercomputers with more specialised processors such as the IBM Power and Sun SPARC are still with us today, by the early 2000s, x86 chips like those in desktop PCs were starting to show up. Evidence that this was the architecture of the future was underlined in 1997, when ASCI Red became the first computer to head the Top500 list that was powered by x86 processors. Since then, the story has been one of an ever-growing core count – from ASCI Red’s 9,152 to several millions today – plus the addition of GPUs to the mix.
Innovation at the top
A cursory glance at the latest Top500 list might suggest there’s little innovation in today’s top-end computers. A few statistics will illustrate this point. The precise processor family may differ, and the manufacturer may be Intel or AMD, but of the 500 computers, 480 use x86 architecture processors. The remaining 20 use IBM Power or Sun SPARC chips, or the Sunway architecture which is largely unknown outside China.
Operating systems are even less varied. Since November 2017, every one of the top 500 supercomputers has run some flavour of Linux, a far cry from the situation in November 2008, when no fewer than five operating system families were represented, including four Windows-based machines.
The Isambard supercomputer is being jointly developed by the Met Office, four UK universities and Cray
So does a place at computing’s top table increasingly depend on little more than how much money you have to spend on processor cores and, if so, will innovation again play a key role? To find out, we spoke to Andrew Jones, VP HPC Consulting & Services at NAG, an impartial provider of HPC expertise to buyers and users around the world.
Initially, Jones took issue with our assertion that innovation no longer plays a role in supercomputer development. In particular, while acknowledging that the current Top500 list is dominated by Intel Xeon processors, and that x86 is likely to rule the roost for the foreseeable future, he stresses that there’s more to a supercomputer than its processors.
“Performance for HPC applications is driven by floating point performance. This in turn needs memory bandwidth to get data to and from the maths engines. It also requires memory capacity to store the model during computation and it requires good storage performance to record state data and results during the simulation,” he explains.
And another vital element is needed to ensure all those cores work together effectively.
“HPC is essentially about scale – bringing together many compute nodes to work on a single problem. This means that HPC performance depends on a good interconnect and on good distributed parallel programming to make effective use of that scale,” Jones adds.
As an example of a high-performance interconnect technology, he points to the recently announced Cray Slingshot. It employs 64 x 200Gbit/s ports and 12.8Tbit/s of bi-directional bandwidth. Slingshot supports up to 250,000 endpoints and takes a maximum of three hops from end to end.
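The port count and the bandwidth figure quoted for Slingshot are consistent with each other, as a quick calculation shows. This is only a sanity check on the numbers above; the function name is illustrative, and it assumes the 12.8Tbit/s figure is simply the sum of all port rates.

```python
# Figures quoted above for Cray Slingshot.
PORTS = 64
PORT_RATE_GBIT_S = 200


def aggregate_bandwidth_tbit_s(ports: int, rate_gbit_s: float) -> float:
    """Sum of all port rates, converted from Gbit/s to Tbit/s (decimal units)."""
    return ports * rate_gbit_s / 1000


print(aggregate_bandwidth_tbit_s(PORTS, PORT_RATE_GBIT_S))  # 12.8
```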
Jones concedes that many HPC experts believe the pace and quality of innovation has slowed due to limited competition but, encouragingly, he expressed a positive view about future developments.
Serious supercomputers sometimes need serious displays, such as EVEREST at the Oak Ridge National Laboratory
“We have a fairly good view of the market directions over the next five years, some from public vendor statements, some from confidential information and some from our own research,” he says.
“The core theme will continue to be parallelism at every level – within the processor, within the node, and across the system. This will be interwoven with the critical role of data movement around the system to meet application needs. This means finding ways to minimise data movement, and deliver better data movement performance where movement is needed.”
However, this doesn’t just mean more of the same, and there will be an increasing role for software.
“This is leading to a growth in the complexity of the memory hierarchy – cache levels, high bandwidth memory, system memory, non-volatile memory, SSD, remote node memory, storage – which in turn leads to challenges for programmers,” Jones explains.
“All this is true, independent of whether we are talking about CPUs or GPUs. It is likely that GPUs will increase in relevance as more applications are adapted to take advantage of them, and as the GPU-for-HPC market broadens to include strong offerings from a resurgent AMD, as well as the established Nvidia.”
Beyond five years, Jones admitted, it’s mostly guesswork, but he has some enticing suggestions nonetheless.
“Technologies such as quantum computing might have a role to play. But probably the biggest change will be how computational science is performed, with machine learning integrating with – not replacing – traditional simulations to deliver a step change in science and business results,” he adds.
Much of what we’ve covered here sounds so esoteric that it might appear of little relevance to ordinary people. So has society at large actually benefitted from all the billions spent on supercomputers over the years?
A report by the HPC team at market analysis company IDC (now Hyperion Research) revealed some of the growing number of supercomputer applications that are already serving society. First up is a range of health benefits, as HPC is better able to simulate the workings of the human body and the influence of medications. Reference was made to research into hepatitis C, Alzheimer’s disease, childhood cancer and heart diseases.
The Barcelona Supercomputing Centre, located in a 19th-century former church, houses MareNostrum, surely the world’s most atmospheric supercomputer
Environmental research is also benefiting from these computing resources. Car manufacturers are bringing HPC to bear on the design of more efficient engines, and researchers are investigating less environmentally damaging methods of energy generation, including geothermal energy, and carbon capture and sequestration in which greenhouse gases are injected back into the earth.
Other applications include predicting severe weather events, generating hazard maps for those living on floodplains or in seismic regions, and detecting sophisticated cyber security breaches and electronic fraud.
Reading about levels of computing performance that most people can barely conceive might engender a feeling of inadequacy. If you crave the ultimate in computer power, however, don’t give up hope.
That CDC 6600, the first ever supercomputer, boasted 3 megaflops, a figure that seems unimaginably slow compared to a run-of-the-mill PC today. And we’d have to come forward in time to 2004, when the BlueGene/L supercomputer topped the Top500 list, to find a machine that would match a current-day PC for pure processing speed.
This is partly the power of Moore’s Law and, if this were to continue into the future – although this is no longer the certainty it once was – performance equal to that of today’s supercomputers may be attainable sooner than you might have expected.
NEW-PRODUCT NEWS ANALYSIS: Even the most technically astute engineers likely don’t have the skills to get an AI-ready system up and running quickly. The majority of organizations looking to embark on AI should look to an engineered system to de-risk the deployment.
This week at the NVIDIA GPU Technology Conference (GTC), flash storage vendor Pure Storage announced an extension to its engineered systems for artificial intelligence. For those not familiar with the term, an engineered system is a turnkey solution that brings together all the technology components required to run a certain workload.
The first example of this was the vBlock system introduced by VCE, a joint venture between VMware, Cisco Systems and EMC. It included all the necessary storage, networking infrastructure, servers and software to stand up a private cloud and took deployment times from weeks or even months to just a few days.
During the past decade, compute platforms have become increasingly disaggregated as companies desired the freedom to pick and choose which storage, network or server vendor to use. Putting the components together for low-performance workloads is fairly easy. Cobbling together the right parts for high-performance ones, such as private cloud and AI, is very difficult, particularly in the area of tuning the software and hardware to run optimally together. Engineered systems are validated designs that are tested and tuned for a particular application.
Airi Comes in Three Versions
Pure’s platform is known as Airi, which was announced at GTC 2019 and uses NVIDIA DGX servers, Arista network infrastructure and Pure Storage Flashblades. There are currently three versions of Airi that range from 2 PFlops of performance to 4 PFlops and 119 TB of flash to 374 TB. All three versions of Airi are single chassis systems. The new ones announced at GTC are multi-chassis systems where multiple Airis can be daisy chained together to create a single, larger logical unit.
Both can accommodate up to 30×17 TB blades. One version uses up to 9 NVIDIA DGX-1 systems for a total compute capacity of 9 PFlops. The other can be loaded up with up to 3 NVIDIA DGX-2 systems for a total processing capability of 6 PFlops per unit. The new units use Mellanox’s (recently acquired by NVIDIA) 100 Gig low-latency Ethernet.
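The figures for the two multi-chassis versions can be worked through in a few lines. This is a rough sketch rather than a vendor sizing tool: the per-system PFlops values are not stated in the article and are simply inferred from the totals it gives (9 PFlops across 9 DGX-1 systems, 6 PFlops across 3 DGX-2 systems), and the function and field names are made up for illustration.

```python
BLADES = 30
BLADE_TB = 17

# Assumption: inferred from the totals quoted above, not stated directly.
PFLOPS_PER_DGX1 = 1  # 9 PFlops / 9 systems
PFLOPS_PER_DGX2 = 2  # 6 PFlops / 3 systems


def airi_config(systems: int, pflops_each: int,
                blades: int = BLADES, blade_tb: int = BLADE_TB) -> dict:
    """Total compute and raw flash for a fully populated unit."""
    return {
        "pflops": systems * pflops_each,
        "raw_flash_tb": blades * blade_tb,
    }


print(airi_config(9, PFLOPS_PER_DGX1))  # {'pflops': 9, 'raw_flash_tb': 510}
print(airi_config(3, PFLOPS_PER_DGX2))  # {'pflops': 6, 'raw_flash_tb': 510}
```

On these assumptions, a fully loaded unit of either type holds 510 TB of raw flash.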
The use of Mellanox Ethernet may seem strange, because Mellanox is the market leader in Infiniband, which often is used to interconnect servers. But its low-latency Ethernet has performance characteristics that are close to Infiniband, and scaling out is simpler with Ethernet. The new Airi systems can be scaled out to 64 racks with a leaf-spine network for a massive amount of AI capacity.
The leaf-spine network architecture is the best network topology for multi-chassis, because it offers consistent performance, high bandwidth, rapid scale and high availability. Companies can use the new Airi systems to start small with a single chassis and then scale out as required.
AI-Optimized Version of Engineered-System Flashstack
Also at GTC 19, Pure Storage announced an AI-optimized version of Flashstack, which is its engineered system using Cisco UCS servers and Nexus data center switches. The new Flashstack for AI uses the Cisco UCS C480 M5 ML AI server, which is optimized for deep learning. The server contains up to eight NVIDIA Tesla V100 GPUs that use the NVLink interconnect to make the eight processors work like a single, massive GPU. Flashstack uses Cisco’s 100 Gig Nexus switches and Pure’s Flashblade system.
The company does have other Flashstack systems, but those were not optimized for AI. Currently the system can’t be set up in a multi-chassis configuration the way Airi can, but that’s likely coming in the not-too-distant future.
Cisco and Pure have a strong partnership and offer a unique way of simplifying the entire data pipeline for AI. Cisco has a wide range of servers for every step in the AI cycle. The UCS C220 is ideal for data collection, and the UCS C240 is optimal for the clean and transform phase. As the graphic shows, a single Flashblade data hub can share the entire data set across the AI lifecycle.
The combination of NVIDIA, Mellanox and Pure Storage, or even Cisco and Pure, has hundreds of possible configuration knobs and levers to tune. While the hardware settings might look complex, they pale in comparison to the AI software. As an example, TensorFlow alone has more than 800 configuration parameters. With Airi and Flashstack, all the heavy lifting has been done and customers can get the product up and running in just a few days. In highly competitive industries, this could make the difference between being a market leader and a laggard.
The AI era has arrived, and IT professionals need to be ready. Even the most technically astute engineers likely don’t have the skills to get an AI-ready system up and running quickly. The majority of organizations looking to embark on AI should look to an engineered system to de-risk the deployment.
Hot storage tech trends for 2019 include cheaper, denser flash and multi-cloud data management that can benefit the consumer and create better performance and data protection.
These are technologies storage pros can use here and now.
Sit back, relax and feel free to agree or disagree as we, the editors and writers at Storage magazine and SearchStorage, lay out the latest trends. This year’s list includes cheaper and denser flash, NVMe-oF, AI and machine learning, multi-cloud data management and composable infrastructure. Read on and find out why these made the grade for hot data storage technology trends 2019.
Denser and cheaper flash
During 2019, a variety of data storage technology trends should work in the consumer’s favor to make enterprise flash storage drives cheaper and denser than ever before.
For starters, the NAND flash shortage that has curbed price declines during the past two years has reversed direction into an oversupply situation. As NAND prices plummet, SSD costs could fall in tandem, since flash chips represent the bulk of the cost of an SSD.
Greg Wong, founder and analyst at Forward Insights, said buyers should expect to see price declines of 30% or more for both the NAND chips and SSDs next year.
Another possibility is manufacturers could sell higher-capacity SSDs for the same price they previously charged for lower-capacity drives, said Jim Handy, general director and semiconductor analyst at Objective Analysis. Handy said prices in early 2019 could be half of what they were at the start of 2018. Estimates vary among market research firms, however.
Trendfocus forecast the price per gigabyte of NAND will drop from 19.5 cents in 2018 to 13.7 cents next year, 11.2 cents in 2020, 8.8 cents in 2021 and 7 cents in 2022, according to John Kim, a vice president at the market research firm based in Cupertino, Calif.
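To put the Trendfocus forecast in perspective, the year-over-year declines can be computed directly from the quoted prices. The sketch below takes "next year" to mean 2019, which is an inference from the surrounding text; the function and variable names are illustrative only.

```python
# Forecast NAND price in cents per gigabyte, as quoted above.
# Assumption: "next year" in the text is 2019.
prices_cents_per_gb = {2018: 19.5, 2019: 13.7, 2020: 11.2, 2021: 8.8, 2022: 7.0}


def yearly_declines(prices: dict) -> dict:
    """Percentage drop in price for each year relative to the year before."""
    years = sorted(prices)
    return {
        year: (prices[prev] - prices[year]) / prices[prev] * 100
        for prev, year in zip(years, years[1:])
    }


for year, pct in yearly_declines(prices_cents_per_gb).items():
    print(f"{year}: {pct:.1f}% lower than the year before")

first, last = prices_cents_per_gb[2018], prices_cents_per_gb[2022]
print(f"2018 to 2022 overall: {(first - last) / first * 100:.1f}% lower")
```

On these figures the 2019 drop comes out just under 30%, in line with the Forward Insights estimate above, with smaller annual declines of roughly 18% to 21% in the years that follow.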
“Lower NAND prices equate to lower SSD prices, and on the enterprise side where capacities are so high, it makes a significant difference,” Kim said. “Hard disk drives have a clear advantage in that dollar-per-gigabyte metric. But SSD companies are closing the gap on that now.”
The shift to denser 3D NAND technology is also helping to drive down the cost of flash. Manufacturers shifted from 32-layer to 64-layer 3D flash in 2018, and they will slowly start ramping to 96-layer in 2019 and beyond. Additional layers enable higher storage capacity in the same physical footprint as a standard SSD using older two-dimensional technology.
Wong predicted 64-layer will continue to dominate in 2019, as NAND manufacturers slow their migrations to 96-layer technology due to the oversupply. Some manufacturers already started promoting their 96-layer technology in 2018, though.
Yet another density-boosting trend on the NAND horizon is quadruple-level cell (QLC) flash chips that can store 4 bits of data per memory cell. Triple-level cell flash (TLC) 3D NAND will continue to factor into the vast majority of enterprise SSDs next year. But Wong predicted QLC will creep from less than 1% of the enterprise SSD market in 2018 to 4.1% in 2019. This will be driven by hyperscale users.
Gartner has predicted that about 20% of the total NAND flash bit output could be QLC technology by 2021, and the percentage could hit 60% by 2025.
QLC flash doesn’t offer the write endurance often needed for high-transaction databases and other high-performance applications. But QLC 3D NAND offers a lower-cost flash alternative than TLC and could challenge cheaper HDDs used for read-intensive workloads in hyperscale environments.
Don Jeanette, a vice president focusing on NAND-SSDs at Trendfocus, said hyperscale deployments will also continue to fuel the enterprise PCIe SSD market and pressure the SATA- and SAS-based SSD market. Analysts expect PCIe SSDs using latency-lowering NVMe will be one of the major data storage technology trends in 2019.
AI and machine learning storage analytics
AI and machine learning have traditionally been used for ransomware detection in storage and backup, but the tech has also been used to generate intelligent, actionable recommendations. Analytics driven by AI and machine learning can help with operations such as predicting future storage usage; flagging inactive or infrequently used data for lower storage tiers; and identifying potential compliance risks, like personally identifiable information. This year, we predict AI- and machine learning-powered products designed to manage and analyze vast amounts of data will become more commonplace, especially as data and IT complexity continue to grow rapidly.
In 2018, vendors launched products using AI and machine learning to guide IT decision-making. Imanis Data Management Platform 4.0’s SmartPolicies can generate optimized backup schedules based on the desired recovery point objective set by the user. Commvault planned to update its interface with embedded predictive analyses regarding storage consumption by the end of 2018. Igneous Systems announced DataDiscover and DataFlow, which index and categorize unstructured data and move it around intelligently. Array vendors including Pure Storage, NetApp and Dell EMC Isilon support Nvidia graphics cards to accelerate analytics for deep learning.
Edwin Yuen, an analyst at Enterprise Strategy Group, said now that AI and machine learning have matured and permeated into other areas of IT, people are starting to warm up to the idea of using them to tackle a fundamental storage challenge: the explosive growth and complexity of big data.
“In order to match up with the growing complexity of IT, it’s not about adding a little more personnel or adding slightly more automation or tools,” Yuen said. “People understand now that they need to make a quantum jump up. We’re going from a bicycle to a car here.”
Human bandwidth is the next bottleneck, according to Yuen. Right now, AI and machine learning storage analytics software generates recommendations for administrators, which they then have to approve. Yuen imagines the next step will take things further, skipping the approval process and removing the need for user input except under very anomalous circumstances.
“The stream of data has come to the point of, ‘Can a human being really process that much data? Or even approve the processing of that much data?’” Yuen said.
AI and machine learning storage analytics will see the most use in automating day-one deployments, according to Yuen. This is because the technology is especially useful in optimizing storage, specifically when it comes to tiering. One of the major changes the adoption of AI and machine learning will bring in IT operations is the need to redefine parameters for storage tiers, thereby laying the groundwork for where algorithms deploy the data.
“Figuring out where everything goes based on the usage is going to get more complicated,” Yuen said. “We’re going to have to make sure storage environments are going to express the parameters that [they have].”
Many organizations think of storage tiers in a binary way, such as inactive data vs. active or higher-performance tiers vs. low. But Yuen believes if software automatically handles tiering intelligently, it opens the door for expanding beyond a two-tier system.
“It would make the most logical sense if storage can go to five or six different platforms, but that’s complicated,” Yuen said. “But if AI and [machine learning] can do that for you, wouldn’t you want to take advantage of all those different options based on what the cost/performance metric is?”
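The idea of placing data across five or six tiers based on usage can be illustrated with a simple rule-based sketch. This is not how any particular product works: the tier names, the idle-time thresholds and the `pick_tier` function are all hypothetical, and a real AI-driven system would learn or adjust such parameters rather than hard-code them.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    name: str
    max_idle_days: float  # data idle for up to this long qualifies for the tier


# Hypothetical tiers, ordered from fastest (most expensive) to slowest (cheapest).
TIERS = [
    Tier("nvme-flash", 1),
    Tier("sata-flash", 7),
    Tier("hybrid-array", 30),
    Tier("capacity-hdd", 180),
    Tier("cloud-archive", float("inf")),
]


def pick_tier(idle_days: float, tiers=TIERS) -> str:
    """Return the first (fastest) tier whose idle-time limit covers the data."""
    for tier in tiers:
        if idle_days <= tier.max_idle_days:
            return tier.name
    return tiers[-1].name


for idle in (0.5, 3, 45, 400):
    print(f"idle {idle} days -> {pick_tier(idle)}")
```

The thresholds here are exactly the kind of tier parameters Yuen suggests storage environments will need to express so that algorithms can decide where data goes.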
NVMe-oF
Flash arrays with NVMe fabric capabilities are showing up in mainstream enterprises faster than experts predicted, even if broad adoption by data centers is several years away.
By 2021, solid-state arrays that use external NVMe-oF to connect to servers and SANs will generate about 30% of storage revenue, according to analyst firm Gartner. That would represent a significant jump above Gartner’s 1% estimate for 2018.
IDC pegged the number even higher, claiming NVMe and NVMe-oF-connected flash arrays will account for roughly half of all external storage revenues by 2021.
NVMe-oF has promising potential to deliver extremely high bandwidth with ultra-low latency, making it one of the hot data storage technology trends of 2019. Plus, the ability to share flash storage systems among multiple server racks would be compelling to organizations with large Oracle Real Application Clusters or other database deployments.
“One of the differences between NVMe and SCSI is it allows you to address storage devices through direct memory access. The SSDs of old were basically PCIe-compatible that plugged into a host server. They didn’t need a [host bus adapter] to talk to storage devices,” said Eric Burgener, a research vice president at IDC.
NVMe-oF extends direct memory access capabilities over a switched fabric to create shared storage with latency as consistent as PCIe flash in a commodity server, Burgener said.
“NVMe over fabrics allows you to share extremely high-performance and costly NVMe storage across many more servers. Any server connected to the switched fabric has access to the storage. With PCIe SSDs in a server, it’s only efficient for that server to access [the storage],” Burgener said.
The NVM Express group initiated NVMe fabric development in 2014, aiming to extend it to Ethernet, Fibre Channel (FC) and InfiniBand, among other technologies. Connectivity between NVMe-oF hosts and target devices remains an ongoing project.
NVMe fabric transports are under development for FC-NVMe and various remote direct memory access protocol technologies, including InfiniBand, RDMA over Converged Ethernet and Internet Wide-Area RDMA Protocol.
In 2018, storage vendors started to catch up to demand, bringing a raft of NVMe products to market. Most new NVMe fabric-based rollouts were from legacy storage vendors, most notably Dell EMC, Hewlett Packard Enterprise (HPE) and NetApp, although startups continue to jockey for position.
End-to-end NVMe arrays are generally defined as rack-scale flash — meaning systems with an NVMe fabric extending from back-end storage directly to applications. Rack-scale systems are mostly the province of startups. For the most part, legacy vendors swapped out SAS-connected SSDs on their all-flash arrays with NVMe SSDs, which gives some performance improvement, but is largely considered more of a retrofit design.
Why do we think NVMe-oF flash is, or will be, hot? The answer is in how the technology accelerates data transfer between server hosts and target storage devices. Legacy SCSI command stacks never envisioned the rise of SSD. NVMe flash sidesteps delays created by traditional network hops, allowing an application to connect directly to underlying block-based storage via a PCIe interface.
The industry developed the NVMe protocol with flash in mind. It is optimized to efficiently govern flash, leading to a consensus that NVMe will eventually supersede SATA SSDs designed around the Advanced Host Controller Interface (AHCI). NVMe draws data closer to the processor, reducing I/O overhead while boosting IOPS and throughput.
“There are two levels of engagement with NVMe. No. 1 is just connecting with it and using it, but that doesn’t give a performance increase unless you rearchitect your data path and the way you lay down data on the flash media,” said Ron Nash, CEO of Pivot3, a vendor that implemented NVMe flash in its Acuity hyper-converged system. “There is a big difference in how you optimize laying down data on spinning disk versus flash media.”
Due to the way NVMe handles queues, it allows multiple tenants to query the same flash device, which gives higher scalability than traditional SAS and SATA SSDs. The NVMe-oF specification marks the next phase of technological development.
Disaggregation is another advantage of NVMe-oF, enabling compute and storage to scale independently. This capability is increasingly important for allocating resources more efficiently for IoT, AI and other latency-sensitive workloads.
One thing to keep in mind is the hype cycle for NVMe. If they’re using flash at all, most companies are using solid-state technology in a server or an all-flash array. Only a curious handful of IT shops deploy NVMe on a large scale, said Howard Marks, founder and chief scientist at DeepStorage.net.
“NVMe over fabrics is a high-speed solution for a specific problem, and that problem is probably [addressed] within a rack or couple of adjacent racks and a switch,” Marks said.
“NVMe over fabrics doesn’t pose a serious threat to legacy SCSI yet,” Marks added. “If it does, people will then switch whole hog to NVMe, and, at that point, all this becomes important,” making it one of the most important data storage technology trends today.
Multi-cloud data management
According to Enterprise Strategy Group research, 81% of companies use more than one cloud infrastructure provider, whether for IaaS or PaaS. And 51% use three or more.
“People are inherently multi-cloud now,” Enterprise Strategy Group’s Yuen said.
Organizations are using multiple clouds for different applications, rather than spreading one kind of workload across clouds, Yuen said.
Multi-cloud benefits include performance, data protection and ensured availability, said Steven Hill, an analyst at 451 Research.
“We contend that a key motivator may be cloud-specific applications and services that may only be cost-effective for data that is stored on that particular platform,” Hill said.
Data management, though, can be challenging when it spans multiple clouds, Hill said.
“It’s becoming increasingly important to establish policy-based data management that uses a common set of rules for data protection and access control regardless of location, but this is easier said than done because of the differences in semantics and format between cloud object stores,” Hill added.
“Planning a multi-cloud data management strategy that supports uniform data management across any cloud platform from the start combines freedom of choice with the governance, security and protection needed to remain compliant with new and evolving data privacy regulations,” he said.
To that end, Hill believes new laws like the European Union’s GDPR and the California Consumer Privacy Act will create a sense of urgency for policy-based data management, no matter where the data lives.
One major issue in dealing with multiple clouds is to understand what each contains and the differences between them, according to Yuen. For example, each one has different terminology, service-level agreements and billing cycles.
“It’s difficult to keep track of all those,” Yuen said, but that’s where the opportunity comes in for vendors.
There are many multi-cloud data management products in the market, several of which launched this year. Here’s a sampling:
Panzura expanded into multi-cloud data management with its Vizion.ai software as a service (SaaS) designed to search, analyze and control data on and off premises.
SwiftStack 1space multi-cloud introduced a single namespace for its object- and file-based storage software to make it easier to access, migrate and search data across private and public clouds.
Scality Zenko helps place, manage and search data across private and public clouds. Use cases include data capture and distribution from IoT devices.
Startup NooBaa — acquired by Red Hat in November 2018 — provides the ability to migrate data between AWS and Microsoft Azure public clouds. It added a single namespace for a unified view of multiple data repositories that can span public and private clouds.
Rubrik Polaris GPS, the first application on the vendor’s new Polaris SaaS management platform, provides policy management and control of data stored on premises and across multiple clouds.
Cohesity also added a SaaS application, Helios, that manages data under control of the vendor’s DataPlatform software, whether on premises or in public clouds.
Yuen noted many companies use more than one vendor for multi-cloud data management, and they’re OK with it.
“It isn’t necessarily one-size-fits-all, but that’s not a bad thing, potentially,” Yuen said.
For multi-cloud data management tools to improve, Yuen said he thinks they will need more insight, from a performance point of view. Tools could offer recommendations for tiering and usage levels across cloud providers. That analytical element is part of the next step in multi-cloud data management.
Right now, Yuen said, “we’re in step one.”
Composable infrastructure
Composable infrastructure may be one of the newest architectures in the IT administrator’s arsenal, but it isn’t brand new. Even though it is a few years old, and we even cited it as one of the data storage technology trends to keep an eye on last year (it didn’t quite live up to the hype in 2018), more vendors are putting out composable infrastructure products than ever before.
Like its IT cousins, converged infrastructure and hyper-converged infrastructure (HCI), composable infrastructure takes physical IT resources such as compute and storage and virtualizes them, placing all of the now-virtualized capacity into shared pools. However, composable infrastructure goes one step further, allowing admins not only to create virtual machines from these pools, as in HCI, but also to apply the pooled resources to physical servers or containers.
Unlike HCI, which uses a hypervisor to manage the virtual resources, composable infrastructure uses APIs and management software to both recognize and aggregate all physical resources into the virtual pools and to provision — or compose — the end IT products.
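The pool-and-compose model described above can be sketched as a small data structure. This is a toy illustration only, not the API of any composable infrastructure product: the `ResourcePool` class, its fields and the workload names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ResourcePool:
    """Free capacity aggregated from all physical resources (hypothetical model)."""
    cpu_cores: int
    memory_gb: int
    storage_tb: int
    composed: dict = field(default_factory=dict)

    def compose(self, name: str, cpu_cores: int, memory_gb: int, storage_tb: int) -> None:
        """Carve a logical system out of the pool, or fail if capacity is short."""
        if (cpu_cores > self.cpu_cores
                or memory_gb > self.memory_gb
                or storage_tb > self.storage_tb):
            raise ValueError(f"not enough free capacity to compose {name!r}")
        self.cpu_cores -= cpu_cores
        self.memory_gb -= memory_gb
        self.storage_tb -= storage_tb
        self.composed[name] = (cpu_cores, memory_gb, storage_tb)

    def release(self, name: str) -> None:
        """Return a composed system's resources to the shared pool."""
        cpu_cores, memory_gb, storage_tb = self.composed.pop(name)
        self.cpu_cores += cpu_cores
        self.memory_gb += memory_gb
        self.storage_tb += storage_tb


pool = ResourcePool(cpu_cores=128, memory_gb=1024, storage_tb=200)
pool.compose("bare-metal-db", cpu_cores=32, memory_gb=256, storage_tb=50)
pool.compose("container-host", cpu_cores=16, memory_gb=64, storage_tb=10)
print(pool.cpu_cores, pool.memory_gb, pool.storage_tb)  # 80 704 140

pool.release("bare-metal-db")
print(pool.cpu_cores, pool.memory_gb, pool.storage_tb)  # 112 960 190
```

In a real deployment the management software would drive these operations through vendor APIs against actual hardware; the sketch only shows the bookkeeping that lets compute and storage be handed out and reclaimed independently.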
Early entrants into the market, going back to 2014, were Cisco and HPE. Recently, newer storage and HCI companies have entered or expressed interest in entering the composable infrastructure market, however. In September 2018, Pivot3’s Nash said the company was moving toward providing composable infrastructure.
In an interview with SearchConvergedInfrastructure at the time, Nash said, “The direction we’re taking is to make the composable stuff into a software mechanism. That software layer will manipulate the platform underneath to [provision] the right type of units to provide the right service level at the right cost level and right security level.”
In August 2018, storage array vendor Kaminario announced it was adding software it calls Kaminario Flex to its NVMe all-flash arrays to enable them to be used in composable infrastructure implementations. But early player HPE isn’t sitting on its Synergy composable product, announcing in May 2018 that its purchase of Plexxi would allow it to add networking capabilities to the virtual pools of resources in its Synergy composable infrastructure platform.
The $180 Deco M4 can cover up to 5,500 square feet.
TP-Link is expanding its range of affordable mesh WiFi systems with the new Deco M4. For $180, you get a three-pack of white, cylindrical WiFi hubs that can cover up to 5,500 square feet of space. Of course, the more access points you have, the more nooks and crannies of your home you can cover.
Aside from being cheaper than offerings from Amazon-owned Eero, Samsung, and Google, TP-Link is also promising ease-of-use. Knowing it will probably attract newbies at this entry-level price point, it says set-up is a breeze using the Deco app for iOS or Android, which can help you find the best place to put Deco nodes throughout your pad. The system can also overcome network hiccups by automatically reconfiguring itself if one Deco node drops out, hopefully giving you an uninterrupted WiFi connection.
The low cost means it lacks some of the eye-catching perks offered by pricier counterparts, like the AI-based networking optimization on Samsung’s $248 SmartThings system. But you do get adaptive path selection (keeping the access points on the fastest data stream), Alexa and IFTTT compatibility, and built-in parental controls. The TP-Link Deco M4 Mesh WiFi System is available online at Amazon and other major retailers.
Part two of two: While all-flash is mainstream, NVMe is an option as disk replacement for most suppliers, while NetApp leads the way with NVMe end-to-end to hosts
As noted in the first piece, the key products of the big five storage array makers – Dell EMC, HPE, Hitachi, IBM, and NetApp – are primarily offered as all-flash storage, with hybrid flash and spinning disk as options too.
Standard specs and features include input/output operations per second (IOPS) that run into the millions and capacities that reach several tens of petabytes (PB) or beyond, Fibre Channel and iSCSI connectivity to hosts, with mainframe connectivity in a small number of products (notably from IBM and Hitachi in this article, plus Dell EMC in the first). Dual controllers, replication, snapshots, encryption, thin provisioning and data reduction are also common to most.
As mentioned, when it comes to the cloud, all the array makers have some way of using the cloud as a tier, which we have looked at here.
In this article, we’ll look at Hitachi’s VSP F- and G-series, IBM’s FlashSystem, StorWize and DS arrays, plus finally NetApp’s AFF/FAS, Solidfire and E/EF-series.
The accompanying piece examined Dell EMC’s VMAX/PowerMax, Xtremio, Unity, SC, and PowerVault ME4 products, plus HPE’s 3PAR and Nimble arrays.
Hitachi Vantara’s VSP F-series all-flash enterprise SAN storage comes in three models: the F700, F900 and F1500. These are claimed to deliver 1.4, 2.4 and 4.8 million IOPS respectively.
They are enterprise-scale storage area network (SAN) arrays with capacity for 864, 1,152, and 2,304 drives respectively, which gives a raw maximum capacity of between 6PB and 8PB.
For the two smaller arrays, connectivity is via 16Gbps or 32Gbps Fibre Channel or iSCSI with 10Gbps Ethernet. The F1500 has Fibre Channel plus (mainframe) FICON, both at 8Gbps and 16Gbps, plus Fibre-Channel-over-Ethernet at 10Gbps.
A pair of midrange F-series arrays – the F350 and F370 – offer a claimed 600,000 IOPS and 1.2 million IOPS with 192 or 288 drive slots for maximum raw capacity of 2.8PB or 4.3PB. Connectivity is 16Gbps and 32Gbps Fibre Channel plus 10Gbps iSCSI/Ethernet.
Meanwhile, the VSP G350, G370 and G900 hybrid flash arrays range from entry-level maximum capacity of a couple of PB to 35PB, depending on the drives used, which can be a mixture of 2.4TB and 6TB spinning disk as well as up to 15TB flash.
Performance ranges from 600,000 IOPS to 4.8 million. Connectivity is similar to the F-series, with mainframe connectivity in the G1500 top-of-the-range model.
FlashSystem: IBM’s FlashSystem products all come with TLC flash drives. The V9000 scales out to eight controllers and disk shelves, or from 43TB to 1.7PB, with IOPS of between 1.3 million and 5.2 million. Connectivity is up to 16Gbps Fibre Channel and 10Gbps iSCSI and Fibre Channel-over-Ethernet.
The V9100 comes with NVMe modules and claims between 2.5 million IOPS and 10 million IOPS for a clustered configuration.
Meanwhile, the FlashSystem A9000 offers TLC flash capacity from just over 100TB up to 1.2PB (both are effective, not raw figures) with up to 900,000 IOPS.
The FlashSystem A9000R comes in three size configurations: 72TB to 144TB, 170TB to 340TB, and 360TB to 720TB – all raw capacities. IOPS is 2.4 million for all three and connectivity is Fibre Channel and iSCSI to 16Gbps and 10Gbps respectively.
The FlashSystem 900 offers 13TB per 2U box with 1.1 million IOPS read-only, and around 800,000 IOPS with 70/30 read/write.
StorWize: IBM’s StorWize V5030F can house up to 760 drives (or double that in a clustered configuration), which, with 15TB flash drives, makes for around 22PB maximum capacity. IBM Spectrum Virtualize software (formerly SAN Volume Controller) allows for storage virtualisation across disparate devices.
StorWize V7000F scales up to 3,040 drives, making for about 44PB raw capacity.
Minus the F suffix, StorWize provides hybrid flash array functionality.
DS series: IBM’s DS8000F series are its all-flash arrays aimed at mainframe use cases. There are three models that come in different capacities up to 1.2PB and connectivity via 8Gbps and 16Gbps Fibre Channel and FICON.
The DS8880 is IBM’s hybrid flash SAN array family, which comes in capacities up to about 5PB, of which approximately one sixth – in terms of drive bays/card slots – can be solid state. Once again, the DS brings mainframe compatibility.
AFF/FAS: NetApp’s AFF series – All-Flash FAS – come in five models that scale from clusters of two to 24 nodes (12 HA pairs) with maximum effective capacities that run up to 700-plus TB per node and the low hundreds of PB (after data reduction) in maximum-sized clusters. Out of these the AFF800 is NVMe-equipped, and offers latency under 200µs with claimed end-to-end NVMe connectivity, including over Fibre-Channel.
NetApp’s FAS line continues as a set of hybrid flash arrays, with the FAS2700, 8200 and 9000 series. The SME/mid-size 2700 starts at 10TB and can scale to 17PB in a 24-node cluster. The 9000 scales to 176PB.
Solidfire: The Solidfire all-flash storage product that NetApp acquired in 2015 comes in a 1U form factor and three models, the H610S-1, H610S-2 and H610S-4.
They each hold 12 flash drives of 960GB, 1.92TB or 3.84TB for total capacity of 11.5TB, 23TB or 46TB.
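The capacity figures above follow directly from the drive count and drive size. As a rough sketch of that arithmetic (the function name is made up for illustration, and decimal units of 1TB = 1,000GB are assumed):

```python
# Raw capacity of a shelf of identical drives, in decimal terabytes.
# Assumption: 1TB = 1,000GB, as drive makers usually quote it.
def raw_capacity_tb(drive_count: int, drive_size_gb: int) -> float:
    return drive_count * drive_size_gb / 1000

# The three drive sizes quoted for the 12-drive Solidfire nodes.
for size_gb in (960, 1920, 3840):
    print(f"12 x {size_gb}GB = {raw_capacity_tb(12, size_gb)}TB raw")
```

The results (11.52TB, 23.04TB and 46.08TB) round to the quoted 11.5TB, 23TB and 46TB. These are raw figures, before any RAID, replication or data reduction is applied.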
E/EF-series: There are two all-flash arrays in NetApp’s EF-series, the EF280 and EF570, which offer 300,000 IOPS and 1 million IOPS respectively in hardware that holds 96 and 120 drives, plus maximum raw capacity that goes up to around 1.5PB with expansion shelves.
Meanwhile, there are two spinning disk E-series arrays, the E5724 and the E5760, with 24 and 60 drives respectively. The E5724 can scale to 120 flash drives and 180 HDDs, while the E5760 can accommodate 120 flash drives and 480 HDDs.
The easiest way to hobble a fast CPU is to pair it with slow storage. While your processor can handle billions of cycles per second, it spends a lot of time waiting for your drive to feed it data. Hard drives are particularly sluggish because they have moving parts. To get the optimal performance you need a good solid state drive (SSD).
If you already know all about the specific drive types and want specific recommendations, check out our Best SSDs page. But if you don’t have a Ph.D in SSD, here are a few things you need to consider when shopping.
First, if you’re going to be shopping for an SSD deal, you’ll want to check out our feature: How to Tell an SSD Deal From a Solid-State Dud. And if you keep an eye on our Best SSD and Storage Deals page, you might snag a sweet price on an older (but still plenty fast) SATA SSD. Also, keep an eye out for deals on higher-capacity drives, like 1 or even 2TB models. That’s where there’s the most potential for great discounts.
Here are four quick tips, followed by our detailed answers to many FAQs:
Know your home computer: Find out if you have slots for M.2 drives on your motherboard and room in the chassis. If not, you may need a 2.5-inch drive instead.
500GB to 1TB capacity: Don’t even consider buying a drive that has less than 256GB of storage. 500GB offers a good balance between price and capacity. And as 1TB drives slide toward the $100/£100 price point, they’re great, roomy options as well.
SATA is cheaper but slower: If your computer supports NVMe PCIe or Optane drives, consider buying a drive with one of these technologies. However, SATA drives are more common, cost less and still offer excellent performance for common applications.
Any SSD is better than a hard drive: Even the worst SSD is at least three times as fast as a hard drive. Depending on the workload, the performance delta between a good and a great SSD can be subtle.
How much can you spend?
Most consumer drives range from 120GB to 2TB. While 120GB drives are the cheapest, they aren’t roomy enough to hold a lot of software and are usually slower than their higher-capacity counterparts. It costs as little as $10 (£7) extra to step up from 120 to 250GB size and that’s money well spent. The delta between 250GB and 500GB drives can be slightly more, but 500GB is the sweet spot between price, performance and capacity for most users–particularly if you don’t have the budget for a 1TB model.
There are also some drives (primarily from Samsung) with capacities above 2TB. But they’re typically expensive in the extreme (well over $500/£500), so they’re really only worthwhile for professional users who need space and speed and aren’t averse to paying for it.
What kind of SSD does your computer support?
Solid-state drives these days come in several different form factors and operate across several possible hardware and software connections. What kind of drive you need depends on what device you have (or are intending on buying). If you own a recent gaming desktop or are building a PC with a recent mid-to-high-end motherboard, your system may be able to incorporate most (or all) modern drive types.
Alternatively, modern slim laptops and convertibles are increasingly shifting solely to the gum-stick-shaped M.2 form factor, with no space for a traditional 2.5-inch laptop-style drive. And in some cases, laptop makers are soldering the storage directly to the board, so you can’t upgrade at all. So you’ll definitely want to consult your device manual or check Crucial’s Advisor Tool to sort out what your options are before buying.
Which form factor do you need?
SSDs come in three main form factors, plus one uncommon outlier.
2.5-inch Serial ATA (SATA): The most common type, these drives mimic the shape of traditional laptop hard drives and connect over the same SATA cables and interface that any moderately experienced upgrader should be familiar with. If your laptop or desktop has a 2.5-inch hard drive bay and a spare SATA connector, these drives should be drop-in-compatible (though you may need a bay adapter if installing in a desktop with only larger 3.5-inch hard drive bays free).
SSD Add-in Card (AIC): These drives have the potential to be much faster than other drives, as they operate over the PCI Express bus, rather than SATA, which was designed well over a decade ago to handle spinning hard drives. AIC drives plug into the slots on a motherboard that are more commonly used for graphics cards or RAID controllers. Of course, that means they’re only an option for desktops, and you’ll need an empty PCIe x4 or x16 slot to install them. If your desktop is compact and you already have a graphics card installed, you may be out of luck. But if you do have room in your modern desktop and a spare slot, these drives can be among the fastest available (take the Intel Optane 900p, for example), due in large part to their extra surface area, allowing for better cooling. Moving data at extreme speeds generates a fair bit of heat.
M.2 SSDs: About the shape of a stick of RAM but much smaller, M.2 drives have become the standard for slim laptops, but you’ll also find them on many desktop motherboards. Some boards even have two or more M.2 slots, so you can run the drives in RAID. While most M.2 drives are 22mm wide and 80mm long, there are some that are shorter or longer. You can tell by the four or five-digit number in their names, with the first two digits representing the width and the others showing length. The most common size is labeled M.2 Type-2280. Though laptops will only work with one size, many desktop motherboards have anchor points for longer and shorter drives. The largest M.2 drives are 1 to 2TB. So, if you have a generous budget and need a ton of storage space, you should consider other form factors.
U.2 SSDs: At first glance, these 2.5-inch components look like traditional SATA hard drives. However, they use a different connector and send data via the speedy PCIe interface, and they’re typically thicker than 2.5-inch hard drives and SSDs. U.2 drives tend to be more expensive and higher-capacity than regular M.2 drives. Servers that have lots of open drive bays can benefit from this form factor.
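The M.2 naming rule described above (the first two digits give the width, the remaining digits give the length, both in millimetres) is simple enough to sketch in a few lines. The function name here is hypothetical, chosen only for illustration:

```python
from typing import Tuple

# Split an M.2 size code such as "2280" into (width_mm, length_mm).
# The first two digits are the width; the remaining two or three are the length.
def parse_m2_size(code: str) -> Tuple[int, int]:
    digits = code.strip()
    if not digits.isdigit() or len(digits) not in (4, 5):
        raise ValueError(f"expected a four or five-digit size code, got {code!r}")
    return int(digits[:2]), int(digits[2:])

for code in ("2242", "2280", "22110"):
    width, length = parse_m2_size(code)
    print(f"Type-{code}: {width}mm wide, {length}mm long")
```

Checking the size code this way tells you only whether a drive will physically fit. It says nothing about whether the slot speaks SATA or PCIe.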
Do you want a drive with a SATA or PCIe interface?
Strap in, because this bit is more complicated than it should be. As noted earlier, 2.5-inch SSDs run on the Serial ATA (SATA) interface, which was designed for hard drives (and launched way back in 2000), while add-in-card drives work over the faster PCI Express bus, which has more bandwidth for things like graphics cards.
M.2 drives can work either over SATA or PCI Express, depending on the drive. And the fastest M.2 drives (including Samsung’s 970 drives and Intel’s 760p) also support NVMe, a protocol that was designed specifically for fast modern storage. The tricky bit (OK, another tricky bit) is that an M.2 drive could be SATA-based, PCIe-based without NVMe support, or PCIe-based with NVMe support. That said, most fast M.2 SSDs launched in the last couple of years support NVMe.
Both M.2 drives and the corresponding M.2 connectors on motherboards look very similar, regardless of what they support. So be sure to double-check the manual for your motherboard, laptop, or convertible, as well as what a given drive supports, before buying.
If your daily tasks consist of web browsing, office applications, or even gaming, most NVMe SSDs aren’t going to be noticeably faster than less expensive SATA models. If your daily tasks consist of heavier work, like large file transfers, video or high-end photo editing, transcoding, or compression/decompression, then you might consider stepping up to an NVMe SSD. These SSDs provide up to five times more bandwidth than SATA models, which boosts performance in heavier productivity applications.
Also, some NVMe drives (like Intel’s SSD 660p) are nearing the price of SATA drives. So if your device supports NVMe and you find a good deal on a drive, you may want to consider NVMe as an option even if you don’t have a strong need for the extra speed.
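To put the bandwidth gap in perspective, here is a back-of-the-envelope sketch of how long a large sequential copy might take on each interface. The numbers are assumptions rather than benchmarks: 600MB/s is the theoretical ceiling of a 6Gbps SATA III link, the five-times multiplier is the "up to" figure quoted above, and the 50GB file is a made-up example.

```python
SATA_MB_PER_S = 600    # theoretical ceiling of a 6Gbps SATA III link
NVME_MULTIPLIER = 5    # the "up to five times" figure; real drives vary
FILE_SIZE_GB = 50      # hypothetical large video project

# Best-case time for a purely sequential transfer, ignoring all overhead.
def transfer_seconds(size_gb: float, mb_per_s: float) -> float:
    return size_gb * 1000 / mb_per_s

sata_time = transfer_seconds(FILE_SIZE_GB, SATA_MB_PER_S)
nvme_time = transfer_seconds(FILE_SIZE_GB, SATA_MB_PER_S * NVME_MULTIPLIER)
print(f"SATA: about {sata_time:.0f}s, NVMe: about {nvme_time:.0f}s")
```

Real-world copies will be slower on both, and the gap narrows or disappears for the small, random reads that dominate everyday use, which is why web browsing and office work rarely feel any different.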
What capacity do you need?
128GB Class: Stay away. These low-capacity drives tend to have slower performance, because of their minimal number of memory modules. Also, after you put Windows and a couple of games on it, you’ll be running out of space. Plus, you can step up to the next level for as little as $10/£7 more.
250GB Class: These drives are much cheaper than their larger siblings, but they’re still quite cramped, particularly if you use your PC to house your operating system, PC games, and possibly a large media library. If there’s wiggle room in your budget, stepping up at least one capacity tier to a 500GB-class drive is advisable.
500GB Class: Drives at this capacity level occupy a sweet spot between price and roominess, although 1TB drives are becoming increasingly appealing.
1TB Class: Unless you have massive media or game libraries, a 1TB drive should give you enough space for your operating system and primary programs, with plenty of room for future media collections and software.
2TB Class: If you work with large media files, or just have a large game library that you want to be able to access on the quick, a 2TB drive could be worth the high premium you pay for it.
4TB Class: You have to really need this much space on an SSD to splurge on one of these. A 4TB SSD will be quite expensive — well over $500/£600 — and you won’t have many options. As of this writing, Samsung was the only company offering consumer-focused 4TB models, in both the 860 EVO and pricier 860 Pro models.
If you’re a desktop user, or you have a gaming laptop with multiple drives and you want lots of capacity, you’re much better off opting for a pair of smaller SSDs, which will generally save you hundreds of dollars while still offering up roughly the same storage space and speed. Until pricing drops and we see more competition, 4TB drives will be relegated to professionals and enthusiasts with very deep pockets.
What about power consumption?
If you’re a desktop user after the best possible performance, then you probably don’t care how much juice you’re using. But for laptop and convertible tablet owners, drive efficiency is more important than speed—especially if you want all-day battery life.
Choosing an extremely efficient drive like Samsung’s 850 EVO over a faster-but-power-hungry NVMe drive (like, say, the Samsung 960 EVO) can gain you 90 minutes or more of extra unplugged run time. And higher-capacity models can draw more power than less-spacious drives, simply because there are more NAND packages on bigger drives to write your data to.
While the above advice is true in a general sense, some drives can buck trends, and technology is always advancing and changing the landscape. If battery life is key to your drive-buying considerations, be sure to consult the battery testing we do on every SSD we test.
What controller should your SSD have?
Think of the controller as the processor of your drive. It routes your reads and writes and performs other key drive performance and maintenance tasks. It can be interesting to dive deep into specific controller types and specs. But for most people, it’s enough to know that, much like PCs, more cores are better for higher-performing, higher-capacity drives.
While the controller obviously plays a big role in performance, unless you like to get into the minute details of how specific drives compare against each other, it’s better to check out our reviews to see how a drive performs overall, rather than focusing too much on the controller.
Which type of storage memory (NAND flash) do you need?
When shopping for an SSD for general computing use in a desktop or laptop, you don’t expressly need to pay attention to the type of storage that’s inside the drive. In fact, with most options on the market these days, you don’t have much of a choice, anyway. But if you’re curious about what’s in those flash packages inside your drive, we’ll walk you through various types below. Some of them are far less common than they used to be, and some are becoming the de facto standard.
Single-Level Cell (SLC) flash memory came first and was the primary form of flash storage for several years. Because (as its name implies) it only stores a single bit of data per cell, it’s extremely fast and lasts a long time. But, as storage tech goes these days, it’s not very dense in terms of how much data it can store, which makes it very expensive. At this point, beyond extremely pricey enterprise drives and use as small amounts of fast cache, SLC has been replaced by newer, denser types of flash storage tech.
Multi-Level Cell (MLC) came after SLC and for years was the storage type of choice for its ability to store more data at a lower price, despite being slower. To get around the speed issue, many of these drives have a small amount of faster SLC cache that acts as a write buffer. Today, apart from a few high-end consumer drives, MLC has been replaced by the next step in NAND storage tech, TLC.
Triple-Level Cell (TLC) flash is still very common in today’s consumer SSDs. While TLC is slower still than MLC, as its name implies, it’s even more data-dense, allowing for spacious, affordable drives. Most TLC drives (except some of the least-expensive models) also employ some sort of caching tech, because TLC on its own without a buffer often is not significantly faster than a hard drive.
For mainstream users running consumer apps and operating systems, this isn’t a problem because the drive isn’t typically written to in a sustained enough way to saturate the faster cache. But professional and pro-sumer users who often work with massive files may want to spend more for an MLC-based drive to avoid slowdowns when moving around massive amounts of data.
Quad-Level Cell (QLC) tech is emerging as the next stage of the solid-state storage revolution. And as the name implies, it should lead to less-expensive and more-spacious drives thanks to an increase in density. As of this writing, there are only a handful of consumer QLC drives on the market, including Intel’s SSD 660p and Crucial’s similar P1, as well as Samsung’s SATA-based QVO drive.
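One way to see why each step from SLC to QLC trades speed and longevity for density: every extra bit stored in a cell doubles the number of distinct charge levels the drive has to tell apart. The short sketch below just tabulates that relationship. The bits-per-cell values are the standard definitions of each type, while the dictionary and function names are only for illustration.

```python
# Bits stored per cell for each NAND type discussed above.
NAND_TYPES = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}

# A cell holding n bits must distinguish 2**n charge levels.
def states_per_cell(bits: int) -> int:
    return 2 ** bits

for name, bits in NAND_TYPES.items():
    print(f"{name}: {bits} bit(s) per cell, {states_per_cell(bits)} charge levels")
```

A QLC cell has to separate 16 levels where an SLC cell separates only two, which helps explain why denser flash generally leans harder on caching.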
What about endurance?
Endurance and over provisioning are two other areas where, for the most part, buyers looking for a drive for general-purpose computing don’t need to dive too deep, unless they want to. All flash memory has a limited life span, meaning after any given storage cell is written to a certain number of times, it will stop holding data. And drive makers often list a drive’s rated endurance in total terabytes written (TBW), or drive writes per day (DWPD).
But most drives feature “over provisioning,” which portions off part of the drive’s capacity as a kind of backup. As the years pass and cells start to die, the drive will move your data off the worn-out cells to these fresh new ones, thereby greatly extending the usable lifespan of the drive. Generally, unless you’re putting your SSD into a server or some other scenario where it’s getting written to nearly constantly (24/7), all of today’s drives are rated with enough endurance to function for at least 3-5 years, if not more.
If you plan on using your drive for much longer than that, or you know that you’ll be writing to the drive far more than the average computer user, you’ll probably want to avoid a QLC drive in particular, and invest in a model with higher-than-average endurance ratings, and/or a longer warranty. Samsung’s Pro drives, for instance, typically have high endurance ratings and long warranties. But again, the vast majority of computer users should not have to worry about a drive’s endurance.
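The two endurance ratings mentioned above, TBW and DWPD, describe the same thing in different units, so one can be worked out from the other if you know the drive’s capacity and the period the rating covers (usually the warranty). Here is a minimal sketch of the usual conversion. The 1TB drive rated for 600TBW over five years is a hypothetical example, not a real product’s spec.

```python
DAYS_PER_YEAR = 365

# Drive writes per day implied by a TBW rating over the rated period.
def tbw_to_dwpd(tbw: float, capacity_tb: float, years: float) -> float:
    return tbw / (capacity_tb * DAYS_PER_YEAR * years)

# Total terabytes written implied by a DWPD rating over the rated period.
def dwpd_to_tbw(dwpd: float, capacity_tb: float, years: float) -> float:
    return dwpd * capacity_tb * DAYS_PER_YEAR * years

# Hypothetical drive: 1TB capacity, 600TBW, five-year warranty.
dwpd = tbw_to_dwpd(600, 1.0, 5)
print(f"{dwpd:.2f} drive writes per day, or about {dwpd * 1000:.0f}GB written daily")
```

For that hypothetical drive the answer is roughly a third of a drive write per day, or around 330GB every single day for five years, far more than a typical home user writes.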
Do you need a drive with 3D flash? And what about layers?
Here again is a question that you don’t have to worry about unless you’re curious. The flash in SSDs used to be arranged in a single layer (planar). But starting with Samsung’s 850 Pro in 2014, drive makers began stacking storage cells on top of each other in layers. Samsung calls its implementation of this tech “V-NAND” (vertical NAND), Toshiba calls it “BiCS FLASH.” Most other companies just call it what it is: 3D NAND. As time progresses, drive makers are stacking more and more layers on top of each other, leading to denser, more spacious, and less-expensive drives.
At this point, the vast majority of current-generation consumer SSDs are made using some type of 3D storage. The latest drives often use 96-layer NAND. But apart from looking at small letters on a spec sheet or box, the only reason you’re likely to notice that your drive has 3D NAND is when you see the price. Newer 3D-based drives tend to cost significantly less than their predecessors at the same capacity, because they’re cheaper to make and require fewer flash packages inside the drive for the same amount of storage.
What about 3D XPoint/Optane?
3D XPoint (pronounced “cross point”), created in a partnership between Intel and Micron (maker of Crucial-branded SSDs), is an emerging new storage technology that has the potential to be much faster than any existing traditional flash-based SSD (think performance similar to DRAM), while also increasing endurance for longer-lasting storage.
While Micron is heavily involved in the development of 3D XPoint, and intends to eventually bring it to market, as of this writing, Intel is the only company currently selling the technology to consumers, under its Optane brand. Optane Memory is designed to be used as a caching drive in tandem with a hard drive or a slower SATA-based SSD, while the Optane 900p (an add-in card) / 905P are standalone drives, and the Intel 800p can be used as either a caching drive or a standalone drive (though cramped capacities make it more ideal for the former).
Optane drives have much potential, both on the ultra-fast performance front and as a caching option for those who want the speed of an SSD for frequently used programs but the capacity of a spinning hard drive for media and game storage. But it’s still very much a nascent technology, with limited laptop support, low capacities and high prices. At the moment, 3D XPoint is far more interesting for what it could be in the near future than for what it offers to consumers today. However, if you have a lot of money to spend, the Intel Optane 905P is the fastest SSD around.
Now that you understand all the important details that separate SSDs and SSD types, your choices should be clear. Remember that high-end drives, while technically faster, won’t often feel speedier than less-spendy options in common tasks.
So unless you’re chasing extreme speed for professional or enthusiast reasons, it’s often best to choose an affordable mainstream drive that has the capacity you need at a price you can afford. Stepping up to any modern SSD over an old-school spinning hard drive is a huge difference that you’ll instantly notice. But as with most PC hardware, there are diminishing returns for mainstream users as you climb up the product stack.