Data in a Flash, Part I: the Evolution of Disk Storage and an Introduction to NVMe

NVMe drives have paved the way for computing at stellar speeds, but the technology didn’t appear overnight. It took an evolutionary process to arrive at the highly performant SSDs we now rely on as a primary storage tier.

Solid State Drives (SSDs) have taken the computer industry by storm in recent years. The technology is impressive in its high-speed capabilities, promising low-latency access to sometimes critical data while increasing overall performance, at least when compared to the now-legacy Hard Disk Drive (HDD). With each passing year, SSD market share continues to climb, displacing the HDD in many sectors. The effects are visible in personal, mobile and server computing.

IBM first unleashed the HDD into the computing world in 1956. By the 1960s, the HDD had become the dominant secondary storage device for general-purpose computers (emphasis on secondary, with memory being the primary). Capacity and performance were the characteristics that defined the HDD, and in many ways, those characteristics continue to define the technology—although not in the most positive ways (more details on that shortly).

The first IBM-manufactured hard drive, the 350 RAMAC, was as large as two medium-sized refrigerators with a total capacity of 3.75MB spread across a stack of 50 disks. Modern HDD technology has produced disk drives with capacities as high as 16TB, specifically with the more recent Shingled Magnetic Recording (SMR) technology coupled with helium—yes, the same chemical element abbreviated as He in the periodic table. The sealed helium gas increases the potential speed of the drive by creating less drag and turbulence. Being less dense than air, it also allows more platters to be stacked in the same space used by conventional 2.5″ and 3.5″ disk drives.

""

Figure 1. A lineup of Standard HDDs throughout Their History and across All Form Factors (by Paul R. Potts—Provided by Author, CC BY-SA 3.0 us, https://commons.wikimedia.org/w/index.php?curid=4676174)

A disk drive’s performance typically is calculated from the time required to move the drive’s heads to a specific track or cylinder (the seek time) and the time it takes for the requested sector to rotate under the head (the latency). Performance is also measured by the rate at which the data is transferred.

Being a mechanical device, an HDD cannot perform nearly as fast as memory. Its many moving components add latency and decrease the overall speed at which you can access data (for both read and write operations).

""

Figure 2. Disk Platter Layout

Each HDD has magnetic platters inside, which often are referred to as disks. Those platters are what store the information. An HDD typically has more than one platter stacked on top of another with a minimal amount of space in between, all bound to a spindle that spins them in unison.

Similar to how a phonograph record works, the platters are double-sided, and the surface of each has circular etchings called tracks. Each track is made up of sectors, and the number of sectors per track increases as you move toward the outer edge of a platter. Nowadays, the physical size of a sector is either 512 bytes or 4 kilobytes (4096 bytes). In the programming world, a sector typically equates to a disk block.
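To make that mapping concrete, here’s a minimal sketch (in Python, and purely illustrative) of the arithmetic a driver or filesystem performs when it converts a logical block address (LBA) into a byte offset. The 512-byte and 4K sector sizes are the two common cases mentioned above.

    # Translate a logical block address (LBA) into a byte offset on the device.
    # The sector size is the unit the drive exposes: commonly 512 or 4096 bytes.

    def lba_to_byte_offset(lba: int, sector_size: int = 512) -> int:
        """Return the byte offset at which the given logical block begins."""
        return lba * sector_size

    # A 4K filesystem block spans eight 512-byte sectors (or exactly one 4K sector).
    print(lba_to_byte_offset(2048, 512))   # 1048576, the 1MiB boundary
    print(4096 // 512)                     # 8 sectors per 4K block on a 512-byte drive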

The speed at which a disk spins affects the rate at which information can be read. This is defined as a disk’s rotation rate, and it’s measured in revolutions per minute (RPM). This is why you’ll find modern drives operating at speeds like 7200 RPM (or 120 rotations per second). Older drives spin at slower rates, while high-end drives spin at higher ones. Either way, the rotation rate imposes the first of the technology’s bottlenecks.

An actuator arm sits above or below a platter and extends and retracts over its surface. At the end of the arm is a read-write head, which floats at a microscopic distance above the surface of the platter. As the disk rotates, the head can access information on the current track without moving. However, if the head needs to move to the next track or to an entirely different track, the time to read or write data increases. From a programmer’s perspective, this is referred to as the disk seek, and it creates the technology’s second bottleneck.
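Those two mechanical delays are easy to quantify. The back-of-the-envelope numbers below are a sketch that assumes a 7200 RPM drive and a typical average seek time of around 9ms; the seek figure is an assumption for illustration and varies from drive to drive.

    # Rough average access time for a spinning disk:
    #   average rotational latency = the time for half a revolution
    #   average access time        = seek time + rotational latency (transfer time ignored)

    rpm = 7200                  # rotation rate from the article: 120 revolutions/second
    avg_seek_ms = 9.0           # assumed typical average seek time; varies by drive

    rotation_ms = 60_000 / rpm                    # one full revolution: ~8.33 ms
    avg_rotational_latency_ms = rotation_ms / 2   # ~4.17 ms on average
    avg_access_ms = avg_seek_ms + avg_rotational_latency_ms

    print(f"one revolution:         {rotation_ms:.2f} ms")
    print(f"avg rotational latency: {avg_rotational_latency_ms:.2f} ms")
    print(f"avg access time:        {avg_access_ms:.2f} ms")   # roughly 13 ms per random I/O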

Now, although HDD performance has been improving with newer disk access protocols—such as Serial ATA (SATA) and Serial Attached SCSI (SAS)—and related technologies, the HDD remains a bottleneck to the CPU and, in turn, to the overall computer system. Each disk protocol has its own hard limits on maximum throughput (megabytes or gigabytes per second), and the method by which data is transferred is highly serialized. That works well with a spinning disk, but it doesn’t scale well to Flash technologies.

Since the HDD’s inception, engineers have been devising newer and more creative methods to accelerate its performance (for example, with memory caching), and in some cases, they’ve replaced it entirely with technologies like the SSD. Today, SSDs are being deployed everywhere—or so it seems. Cost per gigabyte is decreasing, and the price gap between Flash and traditional spinning rust is narrowing. But how did we get here in the first place? The SSD wasn’t an overnight success. Its history is a gradual one, dating back as far as the earliest computers.

A Brief History of Computer Memory

Memory comes in many forms, but before Non-Volatile Memory (NVM) came into the picture, the computing world first was introduced to volatile memory in the form of Random Access Memory (RAM). RAM introduced the ability to write or read data at any location of the storage medium in the same amount of time; the often random physical location of a particular set of data did not affect the speed at which the operation completed. This type of memory masked the pain of accessing data from the exponentially slower HDD by caching frequently read data and staging data waiting to be written.

The most notable of RAM technologies is Dynamic Random Access Memory (DRAM). It also came out of the IBM labs, in 1966, a decade after the HDD. Being that much closer to the CPU and also not having to deal with mechanical components (that is, the HDD), DRAM performed at stellar speeds. Even today, many data storage technologies strive to perform at the speeds of DRAM. But, there was a drawback, as I emphasized above: the technology was volatile, and as soon as the capacitor-driven integrated circuits (ICs) were deprived of power, the data disappeared along with it.

Another set of drawbacks to DRAM is its comparatively low capacity and its price per gigabyte. Even by today’s standards, DRAM remains expensive when compared to the slower HDDs and SSDs.

Shortly after DRAM’s debut came Erasable Programmable Read-Only Memory (EPROM). Invented at Intel, it hit the scene around 1971. Unlike its volatile counterparts, EPROM offered an extremely sought-after industry game-changer: memory that retains its data even after system power is shut off. EPROM used transistors instead of capacitors in its ICs, and those transistors were capable of maintaining state even after the electricity was cut.

As the name implies, the EPROM was in its own class of Read-Only Memory (ROM). Data typically was pre-programmed into those chips using special devices or tools, and when in production, it had a single purpose: to be read from at high speeds. As a result of this design, EPROM immediately became popular in both embedded and BIOS applications, the latter of which stored vendor-specific details and configurations.

Moving Closer to the CPU

As time progressed, it became painfully obvious: the closer you move data (storage) to the CPU, the faster you’re able to access (and manipulate) it. The closest memory to the CPU is the processor’s registers. The number of registers available to a processor varies by architecture. A register’s purpose is to hold a small amount of data intended for fast access. Without a doubt, these registers are the fastest way to access small amounts of data.

Next in line, after the CPU’s registers, is the CPU cache. This is a hardware cache built in to the processor module and used by the CPU to reduce the cost and time it takes to access data from main memory (DRAM). It’s designed around Static Random Access Memory (SRAM) technology, which also is a type of volatile memory. Like a typical cache, the purpose of the CPU cache is to store copies of data from the most frequently used main memory locations. On modern CPU architectures, multiple independent caches exist (and some of those caches even are split). They are organized in a hierarchy of cache levels: Level 1 (L1), Level 2 (L2), Level 3 (L3) and so on. The larger the processor, the more cache levels it has, and the higher the level, the more memory it can store (that is, from KB to MB). On the downside, the higher the level, the farther its location is from the main CPU cores. Although mostly unnoticeable to modern applications, that distance does introduce latency.

""

Figure 3. General Outline of the CPU and Its Memory Locations/Caches
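On a Linux system, you can see this hierarchy for yourself. The short Python sketch below reads the per-CPU cacheinfo entries that the kernel exposes through sysfs; the paths are standard on modern Linux kernels, but treat the snippet as illustrative rather than portable.

    # List the cache hierarchy (L1/L2/L3) the kernel reports for CPU 0.
    from pathlib import Path

    base = Path("/sys/devices/system/cpu/cpu0/cache")
    for index in sorted(base.glob("index*")):
        level = (index / "level").read_text().strip()
        ctype = (index / "type").read_text().strip()   # Data, Instruction or Unified
        size  = (index / "size").read_text().strip()   # e.g., 32K, 256K, 8192K
        print(f"L{level} {ctype:<12} {size}")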

The first documented use of a data cache built in to the processor dates back to 1969 and the IBM System/360 Model 85 mainframe. It wasn’t until the 1980s that the more mainstream microprocessors started incorporating their own CPU caches. Part of that delay was driven by cost. Much as it is today, RAM (of all types) was very expensive.

So, the data access model goes like this: the farther you move away from the CPU, the higher the latency. DRAM sits much closer to the CPU than an HDD, but not as close as the registers or levels of caches designed into the IC.

""

Figure 4. High-Level Model of Data Access

The Solid-State Drive

The performance of any given storage technology was constantly gauged against the speed of CPU memory, so when the first commercial SSDs hit the market, it didn’t take long for both companies and individuals to adopt the technology. Even with a higher price tag than HDDs, people were able to justify the expense: time is money, and if faster access to the drives saves time, it potentially increases profits. Unfortunately, the first commercial NAND-based SSDs didn’t move data storage any closer to the CPU, because early vendors chose to adopt existing disk interface protocols, such as SATA and SAS. That decision did encourage consumer adoption, but it also limited overall throughput.

""

Figure 5. SATA SSD in a 2.5″ Drive Form Factor

Even though the SSD didn’t move any closer to the CPU, it did achieve a new milestone in this technology—it eliminated mechanical seek times across the storage media, resulting in significantly lower latencies. That’s because the drives were designed around ICs and contained no moving components. Overall performance was night and day compared to traditional HDDs.

The first official SSD manufactured without the need of a power source (that is, a battery) to maintain state was introduced in 1995 by M-Systems. Those drives were designed to replace HDDs in mission-critical military and aerospace applications. By 1999, Flash-based technology was being offered in the traditional 3.5″ storage drive form factor, and it continued to be developed this way until 2007, when a revolutionary startup named Fusion-io (now part of Western Digital) decided to break away from the performance-limiting form factor of traditional storage drives and put the technology directly onto the PCI Express (PCIe) bus. This approach removed many unnecessary communication protocols and subsystems. The design also moved storage a bit closer to the CPU and produced a noticeable performance improvement. This new design not only changed the technology for years to come, it also brought the SSD into traditional data centers.

Fusion-io’s products later inspired other memory and storage companies to bring somewhat similar technologies to the Dual In-line Memory Module (DIMM) form factor, which plugs in directly to the traditional RAM slot of the supported motherboard. These types of modules register to the CPU as a different class of memory and remain in a somewhat protected mode. Translation: the main system and, in turn, the operating system did not touch these memory devices unless it was done through a specifically designed device driver or application interface.

It’s also worth noting here that the transistor-based NAND Flash technology still paled in comparison to DRAM performance. I’m talking about microsecond latencies versus DRAM’s nanosecond latencies. Even in a DIMM form factor, the NAND-based modules just don’t perform as well as the DRAM modules.

Introducing NAND Memory

What makes an SSD faster than a traditional HDD? The simple answer is that it is memory built from chips, with no moving components. The name of the technology—solid state—captures that very trait. But if you’d like a more descriptive answer, keep reading.

Instead of saving data onto spinning disks, SSDs save that same data to a pool of NAND flash. NAND (or NOT-AND) technology is made up of floating-gate transistors, and unlike the capacitor-based cells used in DRAM (which must be refreshed multiple times per second), NAND is capable of retaining its charge state even when power is not supplied to the device—hence the non-volatility of the technology.

At a much lower level, in a NAND configuration, electrons are stored in the floating gate. Opposite of how you read boolean logic, a charge is signified as a “0”, and the absence of a charge is a “1”. These bits are stored in cells, which are organized in a grid layout referred to as a block. Each individual row of the grid is called a page, with page sizes typically set to 4K (or more). Traditionally, there are 128–256 pages per block, with block sizes reaching as high as 1MB or larger.

""

Figure 6. NAND Die Layout
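That geometry boils down to simple arithmetic. The sketch below plugs in the figures quoted above (4K pages and 256 pages per block); real devices vary, so treat the numbers as illustrative.

    # NAND geometry arithmetic using the figures quoted above.
    page_size = 4 * 1024        # 4K page
    pages_per_block = 256       # typically 128-256 pages per block

    block_size = page_size * pages_per_block
    print(f"block size: {block_size // 1024} KiB")       # 1024 KiB, or 1MB

    # Writes happen at page granularity, but erases happen at block granularity,
    # so rewriting a single 4K page can eventually force a whole block erase.
    print(f"pages affected by one block erase: {pages_per_block}")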

There are different types of NAND, all defined by the number of bits stored per cell. As the name implies, a single-level cell (SLC) stores one bit. A multi-level cell (MLC) stores two bits. A triple-level cell (TLC) stores three bits. And, new to the scene is the quad-level cell (QLC). Guess how many bits it can store? You guessed it: four.

Now, although a TLC offers more storage density than SLC NAND, that density comes at a price: increased latency—approximately four times worse for reads and six times worse for writes. The reason rests on how data moves in and out of the NAND cell. In an SLC NAND, the device’s controller needs to know only whether the bit is a 0 or a 1. With an MLC, the cell holds four values: 00, 01, 10 or 11. With a TLC, it holds eight: 000, 001, 010, 011, 100, 101, 110 and 111. Distinguishing among those states takes a lot of extra overhead and processing. Either way, regardless of whether your drive uses SLC or TLC NAND, it still will perform night-and-day faster than an HDD—minor details.
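Put another way, every additional bit per cell doubles the number of distinct charge levels the controller has to distinguish, which is where the extra overhead comes from. A quick illustration:

    # The number of distinct charge levels a NAND cell must represent is 2 ** bits_per_cell.
    for name, bits in [("SLC", 1), ("MLC", 2), ("TLC", 3), ("QLC", 4)]:
        levels = 2 ** bits
        values = ", ".join(format(v, "0{}b".format(bits)) for v in range(levels))
        print(f"{name}: {bits} bit(s) per cell -> {levels} levels ({values})")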

There’s a lot more to share about NAND, such as how reads, writes and erases (Program/Erase, or PE, cycles) work, the last of which eventually impacts write performance, as well as some of the technology’s early pitfalls, but I won’t bore you with that. Just remember: electrical charges to chips are much faster than moving heads across disk platters. It’s time to introduce NVMe.

The Boring Details

Okay, I lied. Write performance can and will vary throughout the life of the SSD. When an SSD is new, all of its data blocks are erased and presented as new, and incoming data is written directly to the NAND. Once the SSD has filled all of the free data blocks on the device, it must erase previously programmed blocks before it can write new data. In the industry, this moment is known as the device’s write cliff. Erasing the old blocks is the Program/Erase (PE) cycle, and it increases the device’s write latency. Given enough time, you’ll notice that a used SSD eventually doesn’t perform as well as a brand-new one. Also, a NAND cell can endure only a finite number of erases.
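To see why that finite erase count matters, here’s a rough endurance estimate. The formula (capacity times rated PE cycles, divided by write amplification) is a common rule of thumb, and every number plugged in below is an assumption for illustration rather than a figure from this article.

    # Back-of-the-envelope SSD endurance estimate (a rule of thumb, not a spec).
    capacity_tb = 1.0           # assumed 1TB drive
    rated_pe_cycles = 3000      # assumed rating for TLC NAND; varies widely by part
    write_amplification = 2.0   # assumed; depends on workload and controller behavior

    endurance_tbw = capacity_tb * rated_pe_cycles / write_amplification
    print(f"approximate endurance: {endurance_tbw:.0f} TB written")   # ~1500 TBW

    # At 50GB of host writes per day, that budget would last roughly:
    years = (endurance_tbw * 1000) / 50 / 365
    print(f"~{years:.0f} years at 50GB/day")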

To overcome all of these limitations and eventual bottlenecks, vendors resort to various tricks, including the following:

  • The over-provisioning of NAND: although a device may register 3TB of storage, it may in fact be equipped with 6TB (a rough calculation follows this list).
  • The coalescing of write data to reduce the impacts of write amplification.
  • Wear leveling: reducing the need to write and rewrite the same regions of the NAND.
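As a rough illustration of the first item, here’s how over-provisioning usually is expressed. The 3TB/6TB figures come straight from the list above, although real drives typically reserve a much smaller fraction.

    # Over-provisioning: the gap between the raw NAND on the device
    # and the capacity the device advertises to the operating system.
    raw_tb = 6.0          # NAND physically present (figure from the list above)
    advertised_tb = 3.0   # capacity reported to the operating system

    spare_tb = raw_tb - advertised_tb
    op_percent = spare_tb / advertised_tb * 100
    print(f"over-provisioning: {spare_tb:.1f}TB spare ({op_percent:.0f}% of advertised capacity)")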

Non-Volatile Memory Express (NVMe)

Fusion-io built a closed and proprietary product. That fact alone brought many industry leaders together to define a new standard to compete against the pioneer and push more PCIe-connected Flash into the data center. With the first industry specification announced in 2011, NVMe quickly rose to the forefront of SSD technologies. Remember, historically, SSDs were built on top of SATA and SAS buses. Those interfaces worked well for the maturing Flash memory technology, but with all the protocol overhead and bus speed limitations, it didn’t take long for those drives to experience their own share of performance bottlenecks (and limitations). Today, modern SAS drives operate at 12Gbit/s, while modern SATA drives operate at 6Gbit/s. This is why the technology shifted its focus to PCIe. With the bus closer to the CPU, and PCIe capable of performing at increasingly stellar speeds, SSDs seemed to fit right in. Using four lanes (x4) of PCIe 3.0, modern drives can achieve throughput approaching 32Gbit/s. Support for NVMe drives was integrated into the Linux 3.3 mainline kernel (2012).

""

Figure 7. A PCIe NVMe SSD (by Dsimic – Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=41576100)
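To put the interface speeds quoted above side by side, here’s a small sketch comparing the raw link rates. The encoding overheads (8b/10b for SATA and SAS, 128b/130b for PCIe 3.0) come from the respective specifications, and the PCIe figure assumes a four-lane (x4) link, which is the typical NVMe configuration.

    # Raw interface rates quoted above, normalized to usable gigabits per second.
    interfaces = {
        "SATA 3.0 (6Gbit/s)":  6.0 * 8 / 10,           # 8b/10b encoding: ~4.8 Gbit/s usable
        "SAS-3 (12Gbit/s)":    12.0 * 8 / 10,          # 8b/10b encoding: ~9.6 Gbit/s usable
        "PCIe 3.0 x4 (NVMe)":  4 * 8.0 * 128 / 130,    # 128b/130b encoding: ~31.5 Gbit/s usable
    }
    for name, gbps in interfaces.items():
        print(f"{name:<22} ~{gbps:4.1f} Gbit/s (~{gbps / 8:.2f} GB/s)")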

What really makes NVMe shine over the operating system’s legacy storage stacks is its simpler and faster queueing mechanisms: the Submission Queues (SQs) and Completion Queues (CQs). Each queue is a circular buffer of a fixed size. The operating system places one or more commands on a submission queue, and the NVMe controller posts the results to the paired completion queue. One or more of these queue pairs also can be pinned to specific cores, which allows for more uninterrupted operations. Goodbye serial communication. Drive I/O is now parallelized.
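The queue-pair mechanics are easy to picture with a toy model. The sketch below is purely conceptual, a Python stand-in for the fixed-size ring buffers the specification describes, and not the actual NVMe data structures or driver code.

    from collections import deque

    # Toy model of an NVMe queue pair: the host posts commands to a fixed-size
    # submission queue (SQ); the controller posts results to the paired
    # completion queue (CQ). Real queues are ring buffers shared with the device.
    QUEUE_DEPTH = 4
    submission_queue = deque(maxlen=QUEUE_DEPTH)
    completion_queue = deque(maxlen=QUEUE_DEPTH)

    # Host side: queue several commands without waiting for each to finish.
    for command_id in range(3):
        submission_queue.append({"cid": command_id, "opcode": "READ", "lba": 2048 + command_id})

    # Controller side: consume commands and post completions.
    while submission_queue:
        cmd = submission_queue.popleft()
        completion_queue.append({"cid": cmd["cid"], "status": "SUCCESS"})

    # Host side: reap the completions.
    while completion_queue:
        print("completed command", completion_queue.popleft())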

Non-Volatile Memory Express over Fabric (NVMeoF)

In the world of SAS or SATA, there is the Storage Area Network (SAN). SANs are designed around SCSI standards. The primary goal of a SAN (or any other storage network) is to provide access to one or more storage volumes, across one or more paths, to one or more host operating systems on a network. Today, the most commonly deployed SAN is based on iSCSI, which is SCSI over TCP/IP. Technically, NVMe drives can be configured within a SAN environment, but the protocol overhead introduces latencies that make it a less than ideal implementation. In 2014, the NVM Express committee set out to rectify this with the NVMeoF standard.

The goals behind NVMeoF are simple: enable an NVMe transport bridge, built around the NVMe queuing architecture, and avoid any and all protocol translation overhead other than the supported NVMe commands (end to end). With such a design, network latencies drop noticeably (less than 200ns). The first such design relies on the use of PCIe switches. A second design, which has been gaining ground, is based on existing Ethernet fabrics using Remote Direct Memory Access (RDMA).

""

Figure 8. A Comparison of NVMe Fabrics over Other Storage Networks

The 4.8 Linux kernel introduced a lot of new code to support NVMeoF. The patches were submitted as part of a joint effort by developers at Intel, Samsung and elsewhere. Three major components were patched into the kernel, including the general NVMe Target Support framework, which enables block devices to be exported from the Linux kernel using the NVMe protocol. Built on this framework, there is now support for NVMe loopback devices and for NVMe over Fabrics RDMA targets. If you recall, the latter is one of the two more common NVMeoF deployments.
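As a small preview of the kind of configuration Part II walks through, the sketch below exports a local block device over the NVMe loopback transport by writing to the kernel’s nvmet configfs tree. It assumes root privileges, a 4.8 or newer kernel with the nvmet and nvme-loop modules loaded, and a backing device such as /dev/nvme0n1 (the subsystem name and device path here are hypothetical). Treat the paths and values as illustrative of the target framework rather than a definitive recipe.

    # Export a local block device as an NVMe target over the loopback transport,
    # using the nvmet configfs interface added alongside the 4.8 kernel work.
    # Assumes root, the nvmet/nvme-loop modules loaded, and an existing block device.
    from pathlib import Path

    NQN = "nqn.2019-04.io.example:testsubsystem"   # hypothetical subsystem name
    DEVICE = "/dev/nvme0n1"                        # hypothetical backing block device

    nvmet = Path("/sys/kernel/config/nvmet")
    subsys = nvmet / "subsystems" / NQN
    port = nvmet / "ports" / "1"

    # 1. Create the subsystem and allow any host to connect (fine for a local test).
    subsys.mkdir(parents=True)
    (subsys / "attr_allow_any_host").write_text("1")

    # 2. Add namespace 1, back it with the block device and enable it.
    ns = subsys / "namespaces" / "1"
    ns.mkdir(parents=True)
    (ns / "device_path").write_text(DEVICE)
    (ns / "enable").write_text("1")

    # 3. Create a loopback port and bind the subsystem to it.
    port.mkdir(parents=True)
    (port / "addr_trtype").write_text("loop")
    (port / "subsystems" / NQN).symlink_to(subsys)

    print(f"{DEVICE} exported as {NQN}; connect with: nvme connect -t loop -n {NQN}")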

Conclusion

So, there you have it: an introduction to and deep dive into Flash storage. Now you should understand why the technology is both increasing in popularity and becoming the preferred choice for high-speed computing. Part II of this article shifts focus to using NVMe drives in a Linux environment and accessing them across an NVMeoF network.

Read More

Western Digital -5.6% on EPS miss, weak revenue mix

Western Digital (NASDAQ:WDC) -5.6% after reporting in-line Q3 revenue and an EPS miss. Results were weighed down by $110M in inventory charges in the cost of revenue, primarily due to flash memory products containing DRAM.

Peer Seagate (NASDAQ:STX) is down 1.1%.

Revenue breakdown: Client Devices, $1.63B (last year: $2.31B); Client Solutions, $804M ($1.04B); Data Center and Devices, $1.25B ($1.66B).

HDD units were 27.8M (consensus: 28.06M) with Client compute at 12.9M (12.54M), Non-compute at 9.3M (9.58M), and data center at 5.6M (5.94M).

ASP was $73 versus the $67.02 consensus.

Gross margin was 25.3%, below the 27.9% estimate.

Read More

The spin doctors: Researchers discover surprising quantum effect in hard disk drive material

Scientists find surprising way to affect information storage properties in metal alloy.

Sometimes scientific discoveries can be found along well-trodden paths. That proved the case for a cobalt-iron alloy material commonly found in hard disk drives.

As reported in a recent issue of Physical Review Letters, researchers from the U.S. Department of Energy’s (DOE) Argonne National Laboratory, along with Oakland University in Michigan and Fudan University in China, have found a surprising quantum effect in this alloy.

The effect involves the ability to control the direction of electron spin, and it could allow scientists to develop more powerful and energy-efficient materials for information storage. By changing the electron spin direction in a material, the researchers were able to alter its magnetic state. This greater control of magnetization allows more information to be stored and retrieved in a smaller space. Greater control could also yield additional applications, such as more energy-efficient electric motors, generators and magnetic bearings.

The effect the researchers discovered has to do with “damping,” in which the direction of electron spin controls how the material dissipates energy. “When you drive your car down a flat highway with no wind, the dissipating energy from drag is the same regardless of the direction you travel,” said Argonne materials scientist Olle Heinonen, an author of the study. “With the effect we discovered, it’s like your car experiences more drag if you’re traveling north-south than if you’re traveling east-west.”

“In technical terms, we discovered a sizable effect from magnetic damping in nanoscale layers of cobalt-iron alloy coated on one side of a magnesium oxide substrate,” added Argonne materials scientist Axel Hoffmann, another author of the study. “By controlling the electron spin, magnetic damping dictates the rate of energy dissipation, controlling aspects of the magnetization.”

The team’s discovery proved especially surprising because the cobalt-iron alloy had been widely used in applications such as magnetic hard drives for many decades, and its properties have been thoroughly investigated. It was conventional wisdom that this material did not have a preferred direction for electron spin and thus magnetization.

In the past, however, scientists prepared the alloy for use by “baking” it at high temperature, which orders the arrangement of the cobalt and iron atoms in a regular lattice, eliminating the directional effect. The team observed the effect by examining unbaked cobalt-iron alloys, in which cobalt and iron atoms can randomly occupy each other’s sites.

The team was also able to explain the underlying physics. In a crystal structure, atoms normally sit at perfectly regular intervals in a symmetric arrangement. In the crystal structure of certain alloys, there are slight differences in the separation between atoms that can be removed through the baking process; these differences remain in an “unbaked” material.

Squeezing such a material at the atomic level further changes the separation of the atoms, resulting in different interactions between atomic spins in the crystalline environment. This difference explains how the damping effect on magnetization is large in some directions, and small in others.

The result is that very small distortions in the atomic arrangement within the crystalline structure of cobalt-iron alloy have giant implications for the damping effect. The team ran calculations at the Argonne Leadership Computing Facility, a DOE Office of Science User Facility, that confirmed their experimental observations. 

The researchers’ work appears in the March 21 online edition of Physical Review Letters and is entitled “Giant anisotropy of Gilbert damping in epitaxial CoFe films.”

Read More

NAND Flash Industry in 2019 Has Huge Variables

Back in 1Q18, companies such as Samsung, Toshiba and Micron enjoyed profit margins of around 40% on NAND chips, but prices began to fall sharply after the first quarter, according to a report from Mari Technology Information Co., Ltd. published on February 16, 2019.

NAND flash prices are estimated to have fallen by at least 50% in 1H18. According to analysts, prices will continue to fall by 30% annually until the next round of price increases.

The NAND market changed from prosperity to depression in 2018, a trend that will continue this year. However, price reduction is not the only variable in the 2019 flash memory market: new technologies, new products and the arrival of Chinese manufacturers will bring great changes as well.
 
Insufficient demand inevitably leads to price reduction
A key reason for the sharp price drop in 2018 was the mass production of 64-tier 3D NAND flash. The transition from 32/48-tier to 64-tier reduced the cost of NAND flash memory to 8 cents/GB, compared with 21 cents/GB for the previous 2D NAND flash, making further price cuts possible.

Under the pressure of oversupply, the NAND flash market has gone through several corrections in the past few years. The market declined during 2015 and 2016, then began to rise again in 2017. However, the recovery did not continue as expected: as large fabs competed to scale up investment and bring new 3D NAND capacity online, market demand grew slowly, which accelerated the worldwide price reduction of NAND flash in 2018.

Constrained by technology and yield bottlenecks, 3D NAND yields did not improve as smoothly as expected in 2018. As a result, lower-grade parts circulated on the market, further disrupting prices. Among end applications, the client SSD market was the first to bear the brunt.

The giants repay the debts of their boom years
The hard days are not over yet. The price of NAND flash is expected to drop another 50% this year, and NAND and SSD manufacturers will pay the price for the boom of the previous two years.

Western Digital’s recent financial reports show that the NAND flash price decline has had a major impact on the company. Its revenue and profit for the quarter declined year over year. Although NAND flash capacity shipments increased by 28%, NAND flash prices fell by 16%, resulting in a sharp decline in gross margin. The HDD business fared even worse, with shipments of 34.1 million units, down 8.1 million year over year.

The NAND flash price decline will also affect the earnings of Micron, SK Hynix, Samsung and Toshiba. However, Samsung, Micron and SK Hynix are cushioned by still-high DRAM prices, so their earnings will not be as bad as WD’s. To reduce the impact, WD and Micron are cutting NAND flash capacity investment in order to reduce supply and temper market expectations of further price cuts.
 
64-tier and 96-tier 3D NAND Flash have different destinies
This year will be the first year of widespread 96-tier 3D NAND flash. Following the 64-tier stack, major manufacturers began to mass-produce 96-tier 3D flash in 2Q18. Samsung announced its fifth generation of V-NAND flash (a 3D NAND flash) in early July. It was the first to support the Toggle DDR 4.0 interface, with transfer speeds reaching 1.4Gb/s, 40% higher than 64-tier V-NAND. Its operating voltage is reduced from 1.8V to 1.2V, and its write speed, at only 500μs, is currently the fastest, 30% faster than the previous generation.

In addition, the fifth-generation NAND flash has been optimized for manufacturing: manufacturing efficiency has increased by 30%, and advanced process technology reduces the height of each cell by 20%, which reduces interference between cells and improves data-processing efficiency.

Micron, Intel, Toshiba, WD and SK Hynix have also announced their own 96-tier 3D NAND flash schemes. Among them, WD and Toshiba use a new generation of BiCS4 technology in their 96-tier 3D flash, with QLC die capacity of up to 1.33Tb, 33% higher than the industry standard. Toshiba has also developed a 16-die, single-package flash chip; a single such chip has a capacity of 2.66TB.

Emergence of Chinese manufacturers changes global markets
The global semiconductor market reached $150 billion in 2018, of which NAND flash exceeded $57 billion, and China consumed 32% of global capacity, making it the largest single market. To break its long-term dependence on external procurement, China has made the independent development of domestic memory chips an urgent task.

There is another variable in the 2019 NAND market. Although still in its early stages, it is the one most likely to reshape the structure of the memory-chip market: YMTC in China will begin large-scale production of 3D NAND flash in 2019, competing with Samsung, Toshiba, Micron and other international NAND manufacturers.

YMTC’s 32-tier 3D NAND flash has been released and has entered small-scale production; however, the 32-tier stacking process is not competitive. Recently, YMTC’s 64-tier NAND samples, built on its Xtacking architecture, were sent to supply-chain partners for testing.

If the schedule holds, production could begin as soon as 3Q19, at which point the company will have an opportunity to turn a loss into a profit.

In addition, YMTC plans to skip 96-tier 3D NAND and move directly to 128-tier 3D NAND in 2020. With upgraded production technology and a planned production capacity of 300,000 to 450,000 pieces, the firm will have an opportunity to grab about 10% of the global market share.

At the same time, UNIS is pushing construction forward in other cities; for example, its Nanjing and Chengdu factories both entered the construction stage by the end of 2018. A total of $26.87 billion will be invested in three major production bases to produce 3D NAND chips. UNIS also intends to cooperate with Intel to develop NAND flash technology at full speed.

It is only a matter of time before Chinese manufacturers enter the NAND flash market. Although they remain in the testing stage in 2019, they will still need to solve yield problems as output gradually increases. During this technological transition, it will be worth watching whether products built on immature yields disrupt the market.

Industry variables are huge this year
NAND flash prices are reported to be falling 10% to 15% in the first quarter of 2019. In response, analysts at Citibank maintained a neutral rating on Micron’s shares in their latest report but lowered Micron’s revenue and earnings expectations for 2019, on the grounds that the overall memory market will face a major price reduction this year.

Due to overcapacity and rising inventory, both NAND flash and DRAM are expected to see price reductions in 2019: NAND flash prices are projected to fall 45%, and DRAM prices 30%. Moreover, prices are not expected to bottom out until 2Q19, suggesting that this year’s decline will last at least two quarters.

On the supply side, the yield of 64-tier 3D NAND flash has reached maturity. Coupled with new production capacity coming online, even a delay in 96-tier 3D NAND flash production cannot offset the increasing output of 64-tier parts. Unlike memory, which can be used as cache, flash is the main storage device in a wide range of electronic products, and price reductions are often accompanied by increases in the capacity shipped per device.

Demand-side growth is not keeping pace with output growth, so the industry as a whole will remain oversupplied through the end of 2019.

Read More

GM’s Cruise is preparing for a self-driving future in the cloud

According to marketing firm ABI, as many as 8 million driverless cars will be added to the road in 2025. Meanwhile, Research and Markets are predicting that in the U.S. alone, there will be some 20 million autonomous cars in operation by 2030.

How realistic are those numbers?

If you ask Adrian Macneil, not especially. And he should know — he’s the director of engineering at Cruise, the self-driving startup that General Motors acquired for nearly $1 billion in 2016. “I think the best way that I’ve heard this described [is], the entire industry is basically in a race to the starting line,” Macneil told VentureBeat in a phone interview. “The penetration of driving the majority of miles with autonomous miles isn’t going to happen overnight.”

Cruise is considered a pack leader in a global market that’s anticipated to hit revenue of $173.15 billion by 2023. Although it hasn’t yet launched a driverless taxi service (unlike competitors Waymo, Yandex, and Drive.ai) or sold cars to customers, it’s driven more miles than most — around 450,000 in California last year, according to a report it filed with the state’s Department of Motor Vehicles. That’s behind only Waymo, which drove 1.2 million miles. Moreover, it’s repeatedly promised to launch a commercial service this year that would feature as many as 2,600 driverless cars without steering wheels, brake pedals, and accelerators.

But it’s been a long and winding path for Cruise since its humble beginnings five years ago, to put it mildly. To get a sense of how far Cruise has come and where it’s going, we spoke with Macneil about Cruise’s ongoing efforts to train cars synthetically, why the company is targeting San Francisco as one of several potential launch cities, and how Cruise fits into the broader self-driving landscape.

Rapid growth

Cruise Automation chief technology officer Kyle Vogt — who held the role of CEO until January, when former GM president Dan Ammann took over — cofounded Cruise with Dan Kan in 2013. Vogt, an MIT computer science graduate and a founding employee of Justin.tv (which became Twitch), started a number of companies prior to Cruise, including Socialcam, a mobile social video app that was acquired by Autodesk for $60 million in 2012. (Amazon purchased Twitch in 2016 for $970 million.)

Vogt can trace his passion for robotics back to childhood. By age 14, he had built a Power Wheels car that could drive using computer vision. And while an undergraduate at MIT, he competed with a team in the 2004 Defense Advanced Research Projects Agency (DARPA) Grand Challenge, a $1 million competition to develop a car that could autonomously navigate a route from Barstow, California to Primm, Nevada.

Above: GM’s fourth-generation vehicle, the Cruise AV.

Roughly a year after Cruise joined Y Combinator, Vogt teamed up with Dan Kan — the younger brother of Justin.tv’s Justin Kan — and it wasn’t long before they and a small team of engineers had a prototype: the RP-1. The $10,000 direct-to-consumer aftermarket kit retrofitted the Audi A4 and S4 with highway self-driving features (much like the open source stack developed by George Hotz’s Comma.ai), with the goal of supporting additional vehicles down the line.

But at a certain point, they decided to pivot toward building a more ambitious platform that could conquer city driving. Cruise announced in January 2014 that it would abandon the RP-1 in favor of a system built on top of the Nissan Leaf, and in June 2015, it received a permit to test its tech from the California Department of Motor Vehicles.

GM acquired Cruise shortly afterward, in March 2016. Back then, Cruise had roughly 40 employees, a number that quickly ballooned to 100. Cruise had 200 as of June 2017, and it plans to hire over 2,000 new workers — double its current workforce — by 2021.

Growth hasn’t slowed in the intervening months. In May 2018, Cruise — which remains an independent division within GM — announced that SoftBank’s Vision Fund would invest $2.25 billion in the company, along with another $1.1 billion from GM itself. And in October 2018, Honda pledged $750 million, to be followed by another $2 billion in the next 12 years. Today, Cruise has an estimated valuation of $14.6 billion, and the company recently expanded to a larger office in San Francisco and committed to opening an engineering hub in Seattle.

Along the way, Cruise acquired Zippy.ai, a startup developing autonomous robots for last-mile grocery and package delivery, and more recently snatched up Strobe, a provider of “chip-scale” lidar technology. Cruise says that the latter’s hardware will enable it to reduce the cost of each sensor on its self-driving cars by 99%.

Simulating cities

Cruise runs lots of simulations across its suite of internal tools — about 200,000 hours of compute jobs each day in Google Cloud Platform (25 times the number of hours 12 months ago) — one of which is an end-to-end, three-dimensional Unreal Engine environment that Cruise employees call “The Matrix.” Macneil says it enables engineers to build any kind of situation they’re able to dream up, and to synthesize sensor inputs like camera footage and radar feeds to autonomous virtual cars.

According to Macneil, Cruise spins up 30,000 instances daily across over 300,000 processor cores and 5,000 graphics cards, each of which loops through a single drive’s worth of scenarios and generates 300 terabytes of results. It’s basically like having 30,000 virtual cars driving around in parallel, he explained, and it’s a bit like Waymo’s Carcraft and the browser-based framework used by Uber’s Advanced Technology Group.

Above: A screenshot taken from within “The Matrix,” Cruise’s end-to-end simulation tool. (Image Credit: GM)

“[The Matrix] is really good for understanding how the entire car behaves [and] also how it behaves in situations that we would not encounter frequently in the real world,” said Macneil. “So if we want to find out what happens, say, if a small object jumps in front of a car or something, we can create those kinds of simulations and reliably reproduce them. If every time you have a software release you deploy to the car and then go out and drive 100,000 or 1,000,000 miles, you’re going to be waiting quite a long time for feedback.”

Another testing approach Cruise employs is replay, which involves extracting real-world sensor data, playing it back against the car’s software, and comparing the performance with human-labeled ground truth data. Yet another is a planning simulation, which lets Cruise create up to hundreds of thousands of variations of a scenario by tweaking variables like the speed of oncoming cars and the space between them.

“We understand how, for example, if we take an updated version of the codebase and play back a construction zone, we can actually compare the results … We can really drill down to a really deep level and understand what our car’s behavior will be,” Macneil said. “If we take something like an unprotected left turn, which is a pretty complicated situation … we can [see how changes] affect how quickly our cars are able to identify [gaps between cars] and whether they choose to take that gap or not.”

Cruise doesn’t measure the number of simulated miles it’s driven, and that’s a conscious decision — Macneil says they prefer to place emphasis on the “quality” of miles rather than the total. “We think more about how the tests that are running hundreds of times a day [are covering a] range of scenarios,” he said. “It’s about more than just racking up a lot of miles — it’s about the exposure to different environments that you’re getting from those miles.”

But while its training data remains closely guarded, some of Cruise’s libraries and tools have begun to trickle into open source. In February, it released Worldview, a graphics stack of 2D and 3D scenes with accompanying mouse and movement controls, click interaction, and a suite of built-in commands. In the coming weeks, it will publish a full-featured visualization tool that’ll allow developers to drill into real-world and simulation data to better understand how autonomous systems — whether cars or robots — respond in certain situations.

Cruise control

In the real world, Cruise uses third-generation Chevrolet Bolt all-electric cars equipped with lidar sensors from Velodyne, as well as short- and long-range radar sensors, articulating radars, video cameras, fault-tolerant electrical and actuation systems, and computers running proprietary control algorithms engineered by Cruise. They also sport in-vehicle displays that show information about upcoming turns, merges, traffic light status, and other information, as well as brief explanations of pauses. Most are assembled in a billion-dollar Lake Orion, Michigan plant (in which GM further invested $300 million last month) that’s staffed by 1,000 people and hundreds of robots.

Cruise is testing them in Scottsdale, Arizona and the metropolitan Detroit area, with the bulk of deployment concentrated in San Francisco. It’s scaled up rapidly, growing its starting fleet of 30 driverless vehicles to about 130 by June 2017. Cruise hasn’t disclosed the exact total publicly, but the company has 180 self-driving cars registered with California’s DMV, and three years ago, documents obtained by IEEE Spectrum suggested the company planned to deploy as many as 300 test cars around the country.

Above: A replay test performed with GM’s Worldview platform. (Image Credit: GM)

Currently, Cruise operates an employees-only ride-hailing program in San Francisco called Cruise Anywhere that allows the lucky few who make it beyond the waitlist to use an app to get around all mapped areas of the city where its fleet operates. The Wall Street Journal reported that Cruise and GM hope to put self-driving taxis into usage tests with ride-sharing company Lyft, with the eventual goal of creating an on-demand network of driverless cars.

Building on the progress it’s made so far, Cruise earlier this year announced a partnership with DoorDash to pilot food and grocery delivery for select customers in San Francisco. And it’s making progress toward its fourth-generation car, which features automatic doors, rear seat airbags, and other redundant systems, and lacks a steering wheel.

Testing and safety

Why the focus on San Francisco? Cruise argues that in densely populated cities, difficult maneuvers (like crossing into multiple lanes of oncoming traffic) happen quite often. Moreover, it points out that San Francisco offers more people, cars, and cyclists to contend with — about 17,246 people per square mile, or five times greater density than in Phoenix.

“Testing in the hardest places first means we’ll get to scale faster than starting with the easier ones,” Vogt explained in a blog post. “Based on our experience, every minute of testing in San Francisco is about as valuable as an hour of testing in the suburbs.”

For instance, Cruise’s Bolts encounter emergency vehicles almost 47 times as frequently in San Francisco as in more suburban environments like Scottsdale and Phoenix, and road construction 39 times more often, cyclists 16 times as often, and pedestrians 32 times as often. They’ve navigated in and around six-way intersections with flashing red lights in all directions and people moving pallets through the street of Chinatown, not to mention bicyclists who cut into traffic without the right of way and construction zones delineated by cones or flares.

Above: The mobile app for Cruise Anywhere. (Image Credit: GM)

“Just driving along in a stretch of road, whether it’s in the real world or in simulation, is not going to give you a huge amount of data,” said Macneil. “One of the reasons why we exist in San Francisco is because we encounter pedestrians, cyclists, construction zones, emergency medical, and all of these things just way more [often] … It’s critically important that we’re testing our cars and combing our real-world driving with our simulations, and with both of those looking to get a lot of coverage of what type of situations they’re encountering.”

The data seems to bear out that assertion. Last year, Cruise logged 5,205 miles between disengagements (instances when a safety driver intervened) in California, a substantial improvement over 2017’s 1,254 miles per disengagement.

Here’s how its average of 0.19 disengagements per 1,000 miles compared with others:

  • Waymo: 0.09 disengagements per 1,000 miles
  • Zoox: 0.50 disengagements per 1,000 miles
  • Nuro: 0.97 disengagements per 1,000 miles
  • Pony.ai: 0.98 disengagements per 1,000 miles

Assuming Cruise’s tech works as promised, it could be a godsend for the millions of people who risk their lives every time they step into a car. About 94% of car crashes are caused by human error, and in 2016, the top three causes of traffic fatalities were distracted driving, drunk driving, and speeding.

But will it be enough to convince a skeptical public?

Three separate studies last summer — by the Brookings Institution, think tank HNTB, and the Advocates for Highway and Auto Safety (AHAS) — found that a majority of people aren’t convinced of driverless cars’ safety. More than 60% said they were “not inclined” to ride in self-driving cars, almost 70% expressed “concerns” about sharing the road with them, and 59% expected that self-driving cars will be “no safer” than human-controlled cars.

They have their reasons. In March 2018, Uber suspended testing of its autonomous Volvo XC90 fleet after one of its cars struck and killed a pedestrian in Tempe, Arizona. Separately, Tesla’s Autopilot driver-assistance system has been blamed for a number of fender benders, including one in which a Tesla Model S collided with a parked Culver City fire truck. (Tesla temporarily stopped offering “full self-driving capability” on select new models in early October 2018.)

The Rand Corporation estimates that autonomous cars will have to rack up 11 billion miles before we’ll have reliable statistics on their safety — far more than the roughly 2 million miles the dozens of companies testing self-driving cars in California logged last year. For his part, Macneil believes we’re years away from fully autonomous cars that can drive in most cities without human intervention, and he says that even when the industry does reach that point, it’ll be the first of many iterations to come.

“When you put the rates of improvement at the macro scale and you look at the entire industry, once we get the full self-driving cars on the road that have no safety driver in them and serving passengers, that’s just the first version, right?” he said. “There’s still an endless array of different weather conditions to handle, and different speeds, different situations, long-distance driving, and driving in snow and rain.”

Competition and unexpected detours

For all of its successes so far, Cruise has had its fair share of setbacks.

It backtracked on plans to test a fleet of cars in a five-mile square section in Manhattan, and despite public assurances that its commercial driverless taxi service remains on track, it’s declined to provide timelines and launch sites.

In more disappointing news for Cruise, the firm drove fewer than 450,000 collective miles in California all of last year, falling far short of its projected one million miles a month. (Cruise claims that the initial target was based on “expanding [its] resources equally across all of [its] testing locations,” and says that it’s instead chosen to prioritize its resources in complex urban environments.) For the sake of comparison, Alphabet’s Waymo, which was founded about four years before Cruise, has logged more than 10 million autonomous miles to date.

In a report last year citing sources “with direct knowledge of Cruise’s technology,” The Information alleged that Cruise’s San Francisco vehicles are still repeatedly involved in accidents or near-accidents and that it’s likely a decade before they come into wide use in major cities. Anecdotally, one VentureBeat reporter experienced a close call while crossing the road in front of a Cruise test vehicle in San Francisco.

Then, there’s the competition to consider.

Cruise faces the likes of Ike and Ford, the latter of which is collaborating with Postmates to deliver items from Walmart stores in Miami-Dade County, Florida. There’s also TuSimple, a three-year-old autonomous truck company with autonomous vehicles operating in Arizona, California, and China, as well as venture-backed Swedish driverless car company Einride. Meanwhile, Paz Eshel and former Uber and Otto engineer Don Burnette recently secured $40 million for startup Kodiak Robotics. That’s not to mention Embark, which integrates its self-driving systems into Peterbilt semis (and which launched a pilot with Amazon to haul cargo), as well as Tesla, Aptiv, May Mobility, Pronto.ai, Aurora, NuTonomy, Optimus Ride, Daimler, and Baidu, to name a few others.

Vogt believes that Cruise’s advantage lies in its distributed real-world and simulated training process, which he claims will enable it to launch in multiple cities simultaneously. In a GM investor meeting last year, Vogt conceded that the cars might not match human drivers in terms of capability — at least not at first. But he said that they should quickly catch up and then surpass them.

“Building a new vehicle that has an incredible user experience, optimal operational parameters, and efficient use of space is the ultimate engineering challenge,” he wrote in a recent Medium post. “We’re always looking for ways to accelerate the deployment of self-driving technology, since it’s inherently good in many different ways … We’re going to do this right.”

India’s Mfine raises $17.2M for its digital healthcare service

Mfine, an India-based startup aiming to broaden access to doctors and healthcare using the internet, has pulled in a $17.2 million Series B funding round for growth.

The company is led by four co-founders from Myntra, the fashion commerce startup acquired by Flipkart in 2014. They include CEO Prasad Kompalli and Ashutosh Lawania, who started the business in 2017 and were later joined by Ajit Narayanan and Arjun Choudhary, Myntra’s former CTO and head of growth, respectively.

The round is led by Japan’s SBI Investment with participation from sibling fund SBI Ven Capital and another Japanese investor, Beenext. Existing Mfine backers Stellaris Venture Partners and Prime Venture Partners also returned to follow on. Mfine has now raised nearly $23 million to date.

“In India, at a macro-level, good doctors are far and few and distributed very unevenly,” Kompalli said in an interview with TechCrunch. “We asked ‘Can we build a platform that is a very large hospital on the cloud?’, that’s the fundamental premise.”

There’s already plenty of money in Indian health tech platforms — Practo, for one, has raised over $180 million from investors like Tencent — but Mfine differentiates itself with its focus on partnerships with hospitals and clinics, whereas others have built consumer health communities offering remote sessions with doctors and healthcare professionals recruited independently of their day jobs.

“We are entering a different phase of what is called health tech… the problems that are going to be solved will be much deeper in nature,” Kompalli said.

Mfine essentially makes its money as a digital extension of its healthcare partners, taking a cut of what consumers spend. The company claims to work with over 500 doctors from 100 ‘top’ hospitals, and technology is a big part of the pitch: it says an AI-powered ‘virtual doctor’ can help summarise diagnostic reports, narrow down symptoms, provide care advice and assist with preventative care. Other services include medicine delivery from partner pharmacies.

To date, Mfine says its platform has handled over 100,000 consultations across 800 towns in India during the last 15 months, and it claims it is now seeing around 20,000 consultations per month. Beyond helping increase the utilization of GPs — Mfine claims it can boost their productivity 3-4X — the service can also help hospitals and centers increase their revenue, a precious commodity for many.

Going forward, Kompalli said the company is increasing its efforts with corporate clients, where it can help cover employee healthcare needs, and developing its insurance-style subscription service. Over the coming few years, that channel should account for around half of all revenue, he added.

A more immediate goal is to expand its offline work beyond Hyderabad and Bangalore, the two cities where it currently operates.

“This round is a real endorsement from global investors that the model is working,” he added.

Microsoft discounts consumer Office 365 by 30% under ‘Home Use Program’

The changes cut the price of an annual subscription for Office 365 Home to $69.99 and for Office 365 Personal to $48.99.

Microsoft now offers discounts of 30% on consumer-grade Office 365 subscriptions to employees of companies with “Home Use Program” agreements.

The savings reduce an Office 365 Home subscription to $69.99 a year, and Office 365 Personal to $48.99.

The Home Use Program (HUP) is one of the benefits provided by Software Assurance (SA), which is in turn either included with some Office licensing categories or optional with others. Although SA may be best known for granting upgrade rights to the next version of a “perpetual” license – such as Office 2019 – it is also included with some subscription-based licensing of, for instance, Office 365 or its more inclusive big sister, Microsoft 365.

HUP has long offered employees of eligible organizations discounts on perpetual Office licenses, those purchased with one-time payments that grant the user rights to run the software as long as desired, even theoretically in perpetuity. The offer of Office 365 Home and Office 365 Personal, however, is new.

Reports in February said that the consumer subscriptions would soon come to HUP; it’s unclear when Office 365 Home and Personal were first offered to HUP participants.

Office 365 subscriptions acquired via HUP will simply extend existing Home and Personal plans the employee may already have. Notably, once purchased at discount, all future renewals will also be at the lower price, even if the buyer no longer works for the organization.

Perpetual license products – Office Professional Plus 2019 for Windows 10 and Office Home & Business 2019 for Mac (macOS) – may also be available to a customer with HUP rights. The prices for these packages quoted to Computerworld’s staffers – the publication’s parent company, IDG, has HUP rights – displayed even steeper discounts than for Office 365: Office Professional Plus 2019, which lists for $559, was just $19.04, while Office Home & Business 2019 for Mac was only $14.99 (retail price, $249.99). However, Microsoft made it clear that the one-PC-per-license deals were obsolete and likely to be retired from HUP.

“Microsoft is updating the Home Use Program to offer discounts on the latest and most up to date products, such as Office 365,” the company wrote in an item on a FAQ list.

The Redmond, Wash. firm also trumpeted the value of Office 365 Home or Personal, and thus HUP, even though many organizations provide Microsoft’s productivity applications through corporate Office 365 subscriptions. Those at-work plans allow workers to install Office’s apps on multiple devices, including PCs or Macs used at home.

That generosity doesn’t invalidate HUP, Microsoft argued. “The Office license assigned to you by your employer is for your use only,” Microsoft said elsewhere in the FAQ. “This applies whether you access Office from a device at your home or a device provided by your employer. Whereas if you purchase Office 365 Home through the Home Use Program, it can be used by your family.”

Office 365 Home lets up to six family members install and use the Office applications on their devices; each receives 1TB of OneDrive storage space.

More information about HUP – and instructions on how to determine eligibility – can be found on Microsoft’s Home Use Program website.

Microsoft is now steering eligible customers to consumer-grade Office 365 subscriptions under its Home Use Program benefit, and away from perpetual licenses of Office.

LG G Watch W100 Repair (Smart Watch)

This LG G Watch was stuck on the boot screen; after a successful repair, it is now back in fully working order.

The supersized world of supercomputers

A look at how much bigger supercomputers can get, and the benefits on offer for healthcare and the environment

12th November 2018 was a big day in the upper echelons of computing. On this date, during the SC18 supercomputing conference in Dallas, Texas, the much-anticipated Top500 list was released.

Published twice a year since 1993, this is the list of the world’s 500 fastest computers, and a place at the top of the list remains a matter of considerable national pride. On this occasion, the USA retained the top position and also gained second place, following several years in which China occupied one or both of the top two places.

Commonly referred to as supercomputers, although the term high-performance computing (HPC) is often preferred by those in the know, these monsters bear little resemblance to the PCs that sit on our desks.

Speed demon

The world’s fastest computer is called Summit and is located at the Oak Ridge National Laboratory in Tennessee. It has 2,397,824 processor cores, provided by 22-core IBM Power9 processors clocked at 3.07GHz and Nvidia Volta Tensor Core GPUs, 10 petabytes (PB) of memory (10 million GB) and 250PB of storage.

Summit – today’s fastest supercomputer at the Oak Ridge National Laboratory in Tennessee – occupies a floor area equivalent to two tennis courts

These are the sorts of statistics we expect to see when describing a computer, but some of the less familiar figures are quite an eye-opener. All this hardware is connected together by 300km of fibre-optic cable. Summit is housed in 318 large cabinets, which together weigh 310 tonnes, more than the take-off weight of many large jet airliners. It occupies a floor area of 520 square metres, the equivalent of two tennis courts. All of this consumes 13MW of electrical power, roughly the consumption of 32,000 UK homes. The heat generated by all this energy has to be removed by a cooling system that pumps 18,000 litres of water through Summit every minute.

Oak Ridge is keeping tight-lipped about how much all this costs; estimates vary from $200m to $600m.

So what does all this hardware, and all this money, buy you in terms of raw computing power? The headline peak performance figure quoted in the Top500 list is 201 petaflops (quadrillion floating point operations per second).

Each of Summit’s 4,608 nodes contains six GPUs and two 22-core IBM Power9 CPUs, plus some serious plumbing to keep the hardware from frying

Given that Summit contains over two million processor cores, it will come as no surprise that it’s several thousand times faster than the PCs we use on a day-to-day basis. Even so, a slightly more informed figure may be something of a revelation. An ordinary PC today will probably clock up 50-100 gigaflops, so the world’s fastest computer will, in all likelihood, be a couple of million times faster than your pride and joy, perhaps more.
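
As a rough back-of-the-envelope check of that claim, here is a minimal sketch in Python, assuming only the figures quoted above (Summit’s 201-petaflop peak and 50-100 gigaflops for an ordinary PC):

    # Back-of-the-envelope speed-up estimate using the figures quoted in the article
    summit_peak_flops = 201e15                   # 201 petaflops (peak, per the Top500 listing)
    pc_flops_low, pc_flops_high = 50e9, 100e9    # 50-100 gigaflops for a typical desktop

    print(f"{summit_peak_flops / pc_flops_high:,.0f}x")   # ~2,010,000x a fast desktop
    print(f"{summit_peak_flops / pc_flops_low:,.0f}x")    # ~4,020,000x a slower one

Either way, the ratio lands in the low millions, which is where the “couple of million times faster, perhaps more” estimate comes from.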

It doesn’t stop there, either. While it didn’t meet the stringent requirements of the Top500 list, it appears Summit has gone beyond its official figure, exceeding one exaflops on some workloads by using reduced-precision arithmetic. Oak Ridge says that this ground-breaking machine takes the USA a step closer to its goal of having an exascale computing capability – that is, performance in excess of one exaflops – by 2022.

Half century

The identity of the first supercomputer is a matter of some debate. It appears that the first machine to be referred to as such was the CDC 6600, which was introduced in 1964. The ‘super’ tag referred to the fact that it was 10 times faster than anything else available at the time.

Asking what processor it used would have been meaningless. Like all computers of the day, the CDC 6600 was defined by its CPU, and the 6600’s CPU was designed by CDC and built from discrete components. The computer cabinet was a novel X-shape when viewed from above, with this geometry minimising the lengths of the signal paths between components to maximise speed. Even in these early days, however, an element of parallelism was employed. The main CPU was designed to do only basic arithmetic and logic instructions, thereby limiting its complexity and increasing its speed. Because of the CPU’s reduced capabilities, however, it was supported by 10 peripheral processors that handled memory access and input/output.

Following the success of the CDC 6600 and its successor the CDC 7600, designer Seymour Cray left CDC to set up his own company, Cray Research. This name was synonymous with supercomputing for over two decades. Its first success was the Cray-1.

Designed, like the CDC 6600, to minimise wiring lengths, the Cray-1 had an iconic C-shaped cylindrical design with seats around its outside, covering the cooling system. Its major claim to fame was that it introduced the concept of the vector processor, a technology that dominated supercomputing for a substantial period, and still does today in the sense that ordinary CPUs and GPUs now include an element of vector processing.

The Cray-1 supercomputer broke new ground when it launched in 1975, thanks to its vector processor and iconic design

In this approach to parallelism, instructions are provided that work on several values in consecutive memory locations at once. Compared to the more conventional approach, this offers a significant speed boost by reducing the number of memory accesses, something that would otherwise represent a bottleneck, and increasing the number of results issued per clock cycle. Some vector processors also had multiple execution units and, in this way, were able to offer performance gains in the actual computation as well.
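
To make the idea concrete, here is a small conceptual sketch in Python with NumPy (purely an illustration of the vector idea, not of the Cray-1’s actual instruction set): a single vectorised operation replaces an element-by-element loop over consecutive memory locations.

    import numpy as np

    a = np.arange(100_000, dtype=np.float64)
    b = np.arange(100_000, dtype=np.float64)

    # Scalar style: one element handled per step, with separate memory accesses each time
    c_scalar = np.empty_like(a)
    for i in range(len(a)):
        c_scalar[i] = a[i] + b[i]

    # Vector style: one operation applied across consecutive memory locations at once
    c_vector = a + b

    assert np.array_equal(c_scalar, c_vector)

The vectorised form is far quicker in practice for exactly the reasons given above: fewer instructions issued and fewer separate trips to memory per result.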

The next important milestone in HPC came with the Cray X-MP in 1984. Like the Cray-1, it had a vector architecture, but where it broke new ground was in the provision of not one, but two vector processors. Its successor, the Cray Y-MP, built on this foundation by offering two, four or eight processors. And where Cray led, others followed, with companies eager to claim the highest processor count. By 1995, for example, Fujitsu’s VPP300 had a massive 256 vector processors.

Meanwhile, Thinking Machines had launched its own CM-1 a few years earlier. While each of its processors was an in-house design, much less powerful than the contemporary vector processors, it was able to include many more of them – up to 65,536.

The CM-1 wasn’t able to compete with parallel vector processor machines for pure speed, but it was ahead of its time in the sense that it heralded the next major innovation in supercomputing – the use of vast numbers of off-the-shelf processors. Initially, these were high-performance microprocessor families.

Launched in 1995, the Cray T3E was a notable machine of this type, using DEC’s Alpha chips. And while supercomputers with more specialised processors such as the IBM Power and Sun SPARC are still with us today, by the early 2000s, x86 chips like those in desktop PCs were starting to show up. Evidence that this was the architecture of the future was underlined in 1997, when ASCI Red became the first x86-powered computer to head the Top500 list. Since then, the story has been one of an ever-growing core count – from ASCI Red’s 9,152 to several million today – plus the addition of GPUs to the mix.

Innovation at the top

A cursory glance at the latest Top500 list might suggest there’s little innovation in today’s top-end computers. A few statistics illustrate the point. The precise processor family may differ, and the manufacturer may be Intel or AMD, but of the 500 computers, 480 use x86-architecture processors. The remaining 20 use IBM Power or Sun SPARC chips, or the Sunway architecture, which is largely unknown outside China.

Operating systems are even less varied. Since November 2017, every one of the top 500 supercomputers has run some flavour of Linux – a far cry from the situation in November 2008, when no fewer than five operating system families were represented, including four Windows-based machines.

The Isambard supercomputer is being jointly developed by the Met Office, four UK universities and Cray

So does a place at computing’s top table increasingly depend on little more than how much money you have to spend on processor cores and, if so, will innovation again play a key role? To find out, we spoke to Andrew Jones, VP HPC Consulting & Services at NAG, an impartial provider of HPC expertise to buyers and users around the world.

Initially, Jones took issue with our assertion that innovation no longer plays a role in supercomputer development. In particular, while acknowledging that the current Top500 list is dominated by Intel Xeon processors, and that x86 is likely to rule the roost for the foreseeable future, he stresses that there’s more to a supercomputer than its processors.

“Performance for HPC applications is driven by floating point performance. This in turn needs memory bandwidth to get data to and from the maths engines. It also requires memory capacity to store the model during computation and it requires good storage performance to record state data and results during the simulation,” he explains.

And another vital element is needed to ensure all those cores work together effectively.

“HPC is essentially about scale – bringing together many compute nodes to work on a single problem. This means that HPC performance depends on a good interconnect and on good distributed parallel programming to make effective use of that scale,” Jones adds.

As an example of a high-performance interconnect technology, he points to the recently announced Cray Slingshot. It employs 64 x 200Gbit/s ports, giving 12.8Tbit/s of bi-directional bandwidth. Slingshot supports up to 250,000 endpoints and takes a maximum of three hops from end to end.
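
Those headline figures are self-consistent; a quick sketch of the arithmetic, using only the numbers quoted above:

    ports = 64
    per_port_gbits = 200                      # Gbit/s per port
    print(ports * per_port_gbits / 1_000)     # 12.8 (Tbit/s of switch bandwidth)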

Jones concedes that many HPC experts believe the pace and quality of innovation has slowed due to limited competition, but, encouragingly, he expresses a positive view about future developments.

Serious supercomputers sometimes need serious displays, such as EVEREST at the Oak Ridge National Laboratory

“We have a fairly good view of the market directions over the next five years, some from public vendor statements, some from confidential information and some from our own research,” he says.

“The core theme will continue to be parallelism at every level – within the processor, within the node, and across the system. This will be interwoven with the critical role of data movement around the system to meet application needs. This means finding ways to minimise data movement, and deliver better data movement performance where movement is needed.”

However, this doesn’t just mean more of the same, and there will be an increasing role for software.

“This is leading to a growth in the complexity of the memory hierarchy – cache levels, high bandwidth memory, system memory, non-volatile memory, SSD, remote node memory, storage – which in turn leads to challenges for programmers,” Jones explains.

“All this is true, independent of whether we are talking about CPUs or GPUs. It is likely that GPUs will increase in relevance as more applications are adapted to take advantage of them, and as the GPU-for-HPC market broadens to include strong offerings from a resurgent AMD, as well as the established Nvidia.”

Beyond five years, Jones admits, it’s mostly guesswork, but he has some enticing suggestions nonetheless.

“Technologies such as quantum computing might have a role to play. But probably the biggest change will be how computational science is performed, with machine learning integrating with – not replacing – traditional simulations to deliver a step change in science and business results,” he adds.

Society benefits

Much of what we’ve covered here sounds so esoteric that it might appear of little relevance to ordinary people. So has society at large actually benefitted from all the billions spent on supercomputers over the years?

A report by the HPC team at market analysis company IDC (now Hyperion Research) revealed some of the growing number of supercomputer applications that are already serving society. First up is a range of health benefits, as HPC is better able to simulate the workings of the human body and the influence of medications. Reference was made to research into hepatitis C, Alzheimer’s disease, childhood cancer and heart diseases.

The Barcelona Supercomputing Centre, located in a 19th-century former church, houses MareNostrum, surely the world’s most atmospheric supercomputer

Environmental research is also benefiting from these computing resources. Car manufacturers are bringing HPC to bear on the design of more efficient engines, and researchers are investigating less environmentally damaging methods of energy generation, including geothermal energy, and carbon capture and sequestration, in which greenhouse gases are injected back into the earth.

Other applications include predicting severe weather events, generating hazard maps for those living on floodplains or in seismic regions, and detecting sophisticated cyber security breaches and electronic fraud.

Reading about levels of computing performance that most people can barely conceive might engender a feeling of inadequacy. If you crave the ultimate in computer power, however, don’t give up hope.

That CDC 6600, the first ever supercomputer, boasted 3 megaflops, a figure that seems unimaginably slow compared to a run-of-the-mill PC today. And we’d have to come forward in time to 2004, when the BlueGene/L supercomputer topped the Top500 list, to find a machine that would match a current-day PC for pure processing speed.

This is partly the power of Moore’s Law and, if it were to continue into the future – although that is no longer the certainty it once was – performance equal to that of today’s supercomputers may be attainable on the desktop sooner than you might expect.
