A deep dive into FinFET, DRAM and 3D NAND manufacturing processes

Posted Date: 2024-01-22

We'll take a deep dive into FinFET, DRAM, and 3D NAND manufacturing processes, as well as the upcoming changes to gate-all-around (GAA), CFET, and 3D DRAM process flows. These scaling trends have an especially strong impact on deposition, lithography, and etch, so this in-depth study is also highly relevant to Lam Research, ASMI, Tokyo Electron and Applied Materials. ASMI in particular has made a claim on this point, but KE's case shows that the claim is not entirely correct.

Kokusai Electric, an ALD expert focused on thin film deposition

Kokusai Electric ("KE") was long part of the Hitachi Group as Hitachi Kokusai Electric ("HKE"), which was separately listed on the Tokyo Stock Exchange until March 2018. HKE had two main business groups: semiconductor thin film process solutions, and video and communication solutions. These two groups had completely different business models, products and customer bases.

In 2016, HKE decided to restructure and carve out its segments to different private equity investors. Several parties were involved along the way, including private equity firms KKR, HVJ Holdings and JIP. In 2017, KKR acquired Hitachi Kokusai, paying $2.3 billion for the entire business. The thin film process solutions business remained 100% owned by KKR, while the video and communication business was spun off into an independent company.

Not long after, in July 2019, Applied Materials announced that it wanted to acquire the new Kokusai Electric, the thin film process solutions business only, for $2.2 billion (later raising the offer 59% to $3.5 billion in 2021). The acquisition was subject to antitrust review, and AMAT halted the acquisition after two years of waiting for approval from Chinese authorities.

After holding the business for five years, KKR turned to an exit through the KE IPO. KKR will sell 30% of KE's shares in a secondary offering; KE itself will not issue new shares or raise funds.

Kokusai Electric ("KE") positions itself as an expert in thin film deposition. Specifically, KE sells batch deposition tools, with a particular focus on batch atomic layer deposition (ALD), the most selective/precise form of deposition. KE is also involved in surface treatment, but that business lags far behind deposition in revenue.

About 30% of their revenue is also related to services, but it should be noted that part of KE's services segment is the sale of traditional 200mm equipment. This means KE currently has lower service intensity than other deposition players, which is good for the fab's total cost of ownership (TCO) but could be a financial drag. Having said that, KE's new mini-batch tool has 4x the service attach rate and is the best TCO tool they have ever released.

KE is weighted toward the memory business, with NAND being the largest application for its equipment, followed by DRAM, and then logic. With the adoption of gate-all-around, that may start to change.

KE's expertise does not cover all of chemical vapor deposition (CVD) and atomic layer deposition (ALD), but specifically batch CVD/ALD tools. The other major tool manufacturers (Tokyo Electron, Applied Materials, Lam Research, ASMI) all offer ALD tools, but they mainly focus on single-wafer products, i.e. processing one wafer per chamber rather than multiple wafers (a batch). KE dominates batch ALD with about 70% market share, with the remainder mainly belonging to TEL (and ASMI).

While KE emphasizes its presence in ALD, it also has exposure to batch CVD tools. KE's combined market share in batch deposition is "only" about 46% (compared to about 70% in batch ALD). TEL is only slightly larger in the total batch deposition market, but we believe this will soon reverse and KE will become the larger of the two.

This is a very strong market position and explains why Applied Materials wanted to acquire KE. AMAT already has a strong ALD product portfolio. Although the market believes that ALD leader ASMI will completely dominate and continue to gain share, AMAT still holds many process-of-record (POR) positions for next-generation gate-all-around (GAA) logic ALD steps.

AMAT's expertise lies in single-wafer processing, and it wanted to add KE's batch capabilities to its product portfolio. To be clear, batch ALD and single-wafer ALD are very different capabilities. Expertise in single-wafer ALD does not necessarily transfer directly to building batch ALD tools. AMAT's acquisition attempt was in effect a recognition of KE's proficiency in this segment.

Next, we dive into the underlying technologies behind KE's products: deposition, ALD, and batch processing.

Deposition, ALD, and Batch Processing, a Deep Dive

Deposition, as the name suggests, is the process of depositing specific materials onto a wafer. Although we refer to chips as "silicon" because the base substrate used for chip manufacturing is silicon, there are actually many different materials present on the wafers being manufactured. These materials (usually different metals and oxides) are placed onto the wafer through deposition.

There are several forms of deposition used for different materials: electrochemical deposition (ECD), also known as electroplating; physical vapor deposition (PVD); and chemical vapor deposition (CVD), of which atomic layer deposition (ALD) is a subset. Let's take a quick look.

1. Electrochemical deposition/electroplating

Electroplating is a common technique for depositing a thin film of metal onto another metal surface, and its basic concept is unchanged in semiconductor manufacturing. It is often used to deposit copper, for example to build the interconnects of the metal layers or to fill through-silicon vias. The silicon wafer and a copper source are placed in a bath of conductive liquid, and both are connected to a power supply. When a current is applied, copper from the source dissolves into the bath as ions, and these ions are deposited onto the silicon wafer.

2. Physical vapor deposition/sputtering

Physical vapor deposition (PVD) uses a plasma (similar to the mechanism of plasma etching) to generate metal vapor from the target material. The kinetic energy of the plasma causes the target material to be sputtered onto the wafer and deposited. PVD is commonly used to deposit barrier layers and copper seeds for metal interconnect layers and various forms of nitride liners.

3. Chemical vapor deposition

The most common type of deposition is chemical vapor deposition (CVD). CVD is commonly used for dielectrics and some metals. In CVD, precursor gases are flowed into the chamber. These gases diffuse to the wafer, react chemically, and form the material that is deposited on the wafer.

For example, silicon dioxide is a common dielectric/insulating material. One way to deposit it is to mix silane and oxygen as precursor gases, which react to form silicon dioxide with hydrogen as a byproduct.

4. Atomic layer deposition

Atomic layer deposition (ALD) is a subset of CVD. In ALD, only one precursor gas is introduced at a time. The first gas is flowed in to coat the silicon wafer. An inert gas (such as N2 or argon) is then flowed into the chamber to purge any excess of the first precursor and any by-products. Then the second reactant gas is flowed in. The second reactant attaches to the wafer surface and reacts with the layer of the first reactant to create the target material. This cycle is repeated to build a film of the desired thickness.

The main characteristic of ALD is that it is a self-limiting process. Only one layer of atoms is deposited at a time. Once the surface is saturated, precursors cannot be further deposited. This is why ALD is attractive as a deposition method.

ALD produces pinhole-free films with very high conformality and step coverage, and it allows precise control of film thickness. These are areas where other forms of deposition commonly fall short. ALD is particularly important for more challenging deposition tasks: very thin films, and surfaces with complex topography (i.e. non-planar surfaces), such as 3D or very high aspect ratio structures.

Basically, ALD outperforms traditional CVD in every aspect of deposition quality. The problem is that ALD is much slower, i.e. it costs more floor space, tool time and money. Only one precursor gas is introduced at a time, and each pulse is followed by a purge step to remove excess precursor and by-products, which adds further processing time.

This involves multiple steps compared to traditional CVD which is done in one step. All this results in the deposition of only a single atomic layer. For very thin films this is not a problem, but for thicker films ALD is less attractive. To solve the problem of low throughput, one solution is to batch this process.
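To make the throughput penalty concrete, here is a minimal sketch of the cycle arithmetic. The growth per cycle and the pulse and purge durations are hypothetical values chosen only for illustration; they are not figures from this article or from any tool vendor.

```python
import math


def ald_cycles(target_thickness_angstrom: float, growth_per_cycle_angstrom: float) -> int:
    """Full ALD cycles needed to reach at least the target thickness."""
    return math.ceil(target_thickness_angstrom / growth_per_cycle_angstrom)


def ald_deposition_time_s(
    target_thickness_angstrom: float,
    growth_per_cycle_angstrom: float,
    pulse_a_s: float,
    purge_a_s: float,
    pulse_b_s: float,
    purge_b_s: float,
) -> float:
    """Total time = number of cycles x (pulse A + purge + pulse B + purge)."""
    cycle_time_s = pulse_a_s + purge_a_s + pulse_b_s + purge_b_s
    return ald_cycles(target_thickness_angstrom, growth_per_cycle_angstrom) * cycle_time_s


# Hypothetical recipe: 1 angstrom per cycle, 1 s pulses, 2 s purges (6 s per cycle).
thin_film_s = ald_deposition_time_s(50.0, 1.0, 1.0, 2.0, 1.0, 2.0)
thick_film_s = ald_deposition_time_s(500.0, 1.0, 1.0, 2.0, 1.0, 2.0)
print(f"50 angstrom film:  {thin_film_s / 60:.0f} min")
print(f"500 angstrom film: {thick_film_s / 60:.0f} min")
```

Under these assumed numbers the time scales linearly with thickness, which is why ALD is comfortable for very thin films and unattractive for thick ones.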

To batch or not to batch, that is the question

Instead of processing one wafer at a time, batch tools can process multiple wafers (sometimes hundreds of wafers) to increase throughput. As mentioned previously, it is KE's batch processing capabilities that set them apart in a highly competitive field of ALD vendors (ASMI, Lam Research, Applied Materials, Tokyo Electron).

The benefits of batch processing are clear: process more wafers at once, thereby increasing throughput and lowering tool ownership costs. However, batch processing also has some disadvantages. For example, because the chamber is much larger, it is difficult to control the process conditions. Additionally, with multiple wafers within the chamber, more undesirable interactions can occur, leading to defects.

If anything, batch processing was more common in the early stages of semiconductor manufacturing. Over time, the steady trend is toward greater use of single-wafer tools because they provide more control and flexibility in a world where leading-edge processes have tighter process tolerances.

For a process like ALD, batch processing has huge advantages because it helps solve ALD's main drawback: low throughput. At the same time, as we discussed earlier, ALD is also self-limiting, making control inherent to the process and offsetting the higher defect rates that can come with batch processes.

Another thing to note about batch processing is that it suits large numbers of homogeneous wafers, which is typical in memory but not in logic. In logic, while the highest volume chips do require thousands of wafers per month, and in some extreme cases tens of thousands, it is also common to run only dozens of wafers at a time for a given design. This means a batch tool may not even be filled, and mixing designs in one batch means the process cannot be optimized for a specific wafer using data collected from metrology/inspection.

Accuracy aside, the first question to answer is whether batch processing is really more productive than single-wafer processing. If not, batching doesn't make sense.

While it may seem intuitive that batching 100 wafers at a time would be more productive than processing one wafer at a time, the reality is not that simple. Batch processing introduces significant additional overhead compared to single-wafer processing, which increases cycle time. For example, batch tools use larger process chambers, so it takes longer for the chamber to reach process parameters such as temperature, and it takes longer for the wafers to return to fab ambient conditions after deposition.

This additional overhead is amortized more effectively when the equivalent single-wafer deposition process has a long processing time, which makes batch processing more efficient than single-wafer processing. In other words, the more a wafer's features lengthen the deposition cycle, the more likely batching is to deliver a productivity advantage.
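The amortization argument can be sketched with a simple throughput model. Everything here is an assumption made for illustration: the batch size, chamber count and overhead times are hypothetical, and the model assumes both tool types need the same deposition time per run, which real tools will not match exactly.

```python
def wafers_per_hour(wafers_per_run: int, overhead_min: float, deposition_min: float) -> float:
    """Throughput when each run costs a fixed overhead plus the deposition time."""
    return wafers_per_run * 60.0 / (overhead_min + deposition_min)


def break_even_deposition_min(
    batch_size: int,
    batch_overhead_min: float,
    single_chambers: int,
    single_overhead_min: float,
) -> float:
    """Deposition time t at which batch and single-wafer throughput are equal.

    Solves N / (Ob + t) = C / (Os + t) for t, giving t = (C*Ob - N*Os) / (N - C).
    A result of 0 means the batch tool is ahead at any deposition time.
    """
    if batch_size <= single_chambers:
        raise ValueError("batch size must exceed the number of single-wafer chambers")
    t = (single_chambers * batch_overhead_min - batch_size * single_overhead_min) / (
        batch_size - single_chambers
    )
    return max(t, 0.0)


# Hypothetical tools: a 50-wafer batch furnace with 120 min of overhead per run
# versus a single-wafer platform with 4 chambers and 1 min of overhead per wafer.
t_star = break_even_deposition_min(50, 120.0, 4, 1.0)
print(f"break-even deposition time: {t_star:.1f} min")
for t in (2.0, 30.0):
    batch = wafers_per_hour(50, 120.0, t)
    single = wafers_per_hour(4, 1.0, t)
    print(f"t={t:>4.0f} min  batch={batch:5.1f} wph  single-wafer={single:5.1f} wph")
```

With these assumed inputs, short depositions favor the single-wafer platform and long depositions favor the batch tool, which is the crossover the text describes.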

The main feature that increases deposition time is the high aspect ratio ("HAR") structure. Aspect ratio is the ratio of height to width, so deeper and narrower structures have higher aspect ratios. HAR structures greatly increase the exposed surface area that must be coated. Just as a larger wall takes longer to paint, a larger surface area takes longer to become saturated with atoms.
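As a rough geometric illustration of how aspect ratio inflates surface area, the sketch below treats a hole as an ideal cylinder, which is an assumption (real channel holes taper and bow). For a cylinder of diameter d and depth h, the sidewall plus bottom area divided by the area of the opening works out to 4 × (h/d) + 1.

```python
import math


def hole_surface_area(diameter: float, depth: float) -> float:
    """Sidewall plus bottom area of an ideal cylindrical hole."""
    sidewall = math.pi * diameter * depth
    bottom = math.pi * diameter ** 2 / 4.0
    return sidewall + bottom


def area_multiplier(aspect_ratio: float) -> float:
    """Surface to coat inside the hole, relative to the flat area of its opening."""
    return 4.0 * aspect_ratio + 1.0


# The 70:1 case matches the channel hole aspect ratio cited in this article;
# the other aspect ratios are arbitrary comparison points.
for ar in (1, 10, 70):
    print(f"aspect ratio {ar:>2}:1 -> {area_multiplier(ar):.0f}x the flat area")
```

The precursor has to saturate all of that extra area through the same small opening, so saturation time grows with aspect ratio.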

Another reason is depth loading, which we have seen as a NAND etch challenge and which is why Tokyo Electron was able to gain 3D NAND market share from Lam Research. The same principle applies to deposition: it takes longer for gas to penetrate deep, narrow trenches. Unlike in etch, however, the solution cannot be to run the process at low temperatures, as ALD requires temperatures above room temperature.

KE has a large-batch tool, AdvancedAce, which can process up to 175 wafers at a time (this applies only to CVD, not ALD), while Tsurugi is a "mini-batch" tool that can process up to 50 wafers at a time. The reason for the smaller batch size is that a smaller chamber takes less time to reach process parameters (for example when heating and cooling) and provides higher gas flow rates than a larger chamber. This can give a better balance of overhead time and deposition time to optimize final throughput.

We usually see high aspect ratio structures in memory, not in logic. For NAND, this is the channel hole, with an aspect ratio >70:1. DRAM also has high aspect ratio trenches for capacitors.

A typical example: for 48-layer 3D NAND, the deposition step is fast enough that batch ALD has lower throughput than single-wafer ALD. Due to depth loading, the two become comparable at 64 layers. Beyond that, batch processing pulls further ahead as layer counts rise. Why? More layers mean higher aspect ratio channel holes, and higher aspect ratio holes mean longer processing times. Longer processing times flip the throughput equation in favor of batch processing over single-wafer processing.

There are other factors that make batch processing more common in memory. Batch tools have longer idle times because the chamber runs only when full. Wafers can sit idle at the tool, waiting for other wafers to complete earlier preparation steps before deposition. This is less of an issue in memory because the wafers are very homogeneous, whereas logic fabs run many different kinds of wafers.

For logic, the flexibility and fast cycle times of single-wafer tools allow for more variation in process conditions, which speeds up R&D and prototyping. In many cases, foundries want to run only a few wafers, a practice known as a "hot lot", which Intel overuses at a cost of billions of dollars. This reduces utilization but gets data to the design and production teams as quickly as possible. Iterating on designs, qualifying samples, and adjusting process parameters to increase yield are a constant struggle in logic.

Compared to logic and foundries, memory fabs are more cost-sensitive because memory is a commodity. At the end of the day, cost is the only differentiating factor for the product. In memory, the process is fine-tuned, and then you run hundreds of thousands of wafers of that product over many years.

Additionally, memory wafers are much cheaper compared to logic wafers. Leading logic wafers cost around $20,000 each, which is very expensive if you produce 175 wafers in batches and the process doesn't work properly.
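A quick sketch of the exposure per run, using the roughly $20,000 leading-edge logic wafer cost and the 175-wafer and 50-wafer batch sizes quoted in this article. It assumes, purely for illustration, that a failed run scraps every wafer in the chamber.

```python
LOGIC_WAFER_COST_USD = 20_000  # approximate figure quoted in the article


def value_at_risk_usd(wafers_in_chamber: int, wafer_cost_usd: int) -> int:
    """Wafer value lost if one run fails and every wafer in it is scrapped."""
    return wafers_in_chamber * wafer_cost_usd


for label, wafers in (("single wafer", 1), ("50-wafer batch", 50), ("175-wafer batch", 175)):
    risk = value_at_risk_usd(wafers, LOGIC_WAFER_COST_USD)
    print(f"{label:>16}: ${risk:,}")
```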

Applications of Atomic Layer Deposition

Where is ALD best used? In general, ALD is used for structures with complex topography, especially high aspect ratio or 3D structures, or for very thin films. For critical films, the key is to achieve good "step coverage": a uniform thickness deposited across an uneven substrate. As manufacturing processes become more 3D, step coverage becomes more difficult to achieve. If you are trying to coat the walls of a deep trench, it is important that the bottom is coated at the same rate as the walls near the trench mouth.
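Step coverage is commonly expressed as the ratio of film thickness deep in the feature to film thickness at the top surface, with 1.0 meaning a perfectly conformal film. The sketch below applies that definition to made-up thickness values; none of the measurements come from this article.

```python
def step_coverage(thickness_bottom_nm: float, thickness_top_nm: float) -> float:
    """Ratio of film thickness at the bottom of a feature to thickness at the top."""
    if thickness_top_nm <= 0:
        raise ValueError("top thickness must be positive")
    return thickness_bottom_nm / thickness_top_nm


# Hypothetical measurements for two films in the same deep trench.
conformal = step_coverage(thickness_bottom_nm=10.0, thickness_top_nm=10.0)
starved = step_coverage(thickness_bottom_nm=6.0, thickness_top_nm=10.0)
print(f"conformal film:      {conformal:.0%}")
print(f"bottom-starved film: {starved:.0%}")
```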

Today, all NAND flash is 3D, so we see a lot of ALD used for it. Logic is becoming more 3D with gate-all-around structures, and 3D DRAM is on the roadmap. Structurally, this means higher etch and deposition intensity overall. Correspondingly, the lithography intensity of 3D NAND is reduced.

Specifically, let's first look at the application of ALD in 3D NAND.

First, alternating layers of oxide and nitride films are deposited onto a base silicon wafer. Each layer is 20 to 30 nm thick. The theoretical limit for each stack can be over 250 layers, approaching 7 microns in height. A thick hard mask is then added to prepare for the high aspect ratio (HAR) channel hole etch. This reactive ion etch creates an array of holes that are 70 times as deep as they are wide. Channel hole roundness and uniformity throughout the depth of the hole are critical to reducing variability in memory cell performance. For designs with multiple decks, these steps are repeated and the decks are stacked on top of each other.
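The stack dimensions above can be checked with simple arithmetic. This sketch assumes every layer is a single film of the same thickness, ignores the hard mask, and treats the hole depth as equal to the stack height; those simplifications are mine and not statements from the article.

```python
def stack_height_um(layers: int, layer_thickness_nm: float) -> float:
    """Height of a stack of equally thick films, in microns."""
    return layers * layer_thickness_nm / 1000.0


def hole_width_nm(depth_um: float, aspect_ratio: float) -> float:
    """Width of a hole with the given depth and depth-to-width ratio."""
    return depth_um * 1000.0 / aspect_ratio


# 250 layers across the 20 to 30 nm range quoted in the article.
for thickness_nm in (20.0, 28.0, 30.0):
    height = stack_height_um(250, thickness_nm)
    width = hole_width_nm(height, 70.0)
    print(f"{thickness_nm:.0f} nm layers: stack {height:.1f} um, 70:1 hole about {width:.0f} nm wide")
```

At 28 nm per layer the stack comes out at 7 microns, consistent with the figure in the text, and a 70:1 hole through it would be on the order of 100 nm wide.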

Next, the channel hole is filled with multiple layers to form a charge trap cell, with each layer deposited on the sidewalls gradually narrowing the hole. Then comes the replacement metal gate process. Slits are etched through all layers to form trenches that expose the sides of the stack. This allows removal of the nitride layers, followed by barrier deposition via ALD and tungsten wordline fill. A staircase is etched into the sides of the array to expose the wordline layers to vertical contacts.

Finally, bit lines and metal interconnects are formed on top and connected to the fabricated CMOS circuitry, which includes word line drivers and other peripheral circuits for the NAND interface. From this we can see that 3D NAND is highly dependent on HAR etch and deposition capabilities to expand density and performance.

NAND etching gets a lot of attention, but be aware that there's also a lot of deposition.

The figure below is a cross-section of a 3D NAND memory cell, and you can see the many different materials used. There are six films that require ALD, among them the blocking oxide, charge trap nitride, tunnel oxide, and channel silicon. This is in addition to the base silicon and the tungsten fill in the word lines. For the blocking oxide, charge trap nitride and tunnel oxide, KE's batch ALD is the process tool of record at all of the top 5 NAND players. KE is also present for channel silicon and barrier metal, but it is the first three steps that KE truly dominates.

Usually, when it comes to competition among semiconductor equipment manufacturers, we would say that the market is complex. It is difficult to treat deposition as a single market, as there are many different segments and niches, each with its own leader. Within each niche there is far less competition than people realize, and this is a great example: for these specific NAND deposition steps, KE clearly dominates.

Secondly, DRAM has also begun to embrace ALD.

In DRAM, the high aspect ratio features are capacitors. Each bit of data is stored in the capacitor as a negative or positive charge. Each capacitor is connected to a transistor, which controls access to the data in the capacitor. This is the single-transistor, single-capacitor (1T1C) memory cell architecture on which DRAM is based.

The capacitor itself is a long cylindrical structure with a high aspect ratio. It is filled with a metal-insulator-metal (MIM) stack. The insulator is high-k zirconium dioxide, which prevents leakage while maintaining capacitance. Such MIM stacks require ALD because well-controlled conformal films must be formed in high aspect ratio structures. This is the step where KE has a strong position in DRAM. For example, we know that KE's batch ALD is used for some parts of Samsung's high-k deposition, and it may also be used by other DRAM manufacturers.

Currently, a key challenge in further shrinking DRAM is reducing the size of the capacitors. Any further shrinkage of the capacitor renders the capacitor unable to hold a charge, rendering it useless. Like NAND, 3D DRAM is proposed as a future architecture to enable continued cost scaling.

Most equipment manufacturers believe that 3D DRAM will be in mass production in the second half of this decade (except ASML, which insists that 3D DRAM production will come well beyond 2030). What exactly a 3D DRAM architecture will look like has yet to be determined, as there are a few potentially viable architectures. This is an opportunity for ALD and etch, but it is also a threat to some tool manufacturers, because it will lead to a reshuffle of market share.

Finally, logic also became the target of ALD.

ALD's first major entry into logic manufacturing was in 2007, when Intel introduced ALD in its 45nm process. Previously, silicon dioxide was used as the gate insulator. As feature sizes shrank, the silicon dioxide layer shrank too, but at thicknesses around 2 nanometers it was found that silicon dioxide can no longer adequately block leakage current.

Intel's 45nm node introduced a revolutionary high-k metal gate (HKMG) structure that greatly reduces leakage current and is a key enabler for scaling beyond the 65nm node. The HKMG structure replaces the traditional silicon oxide insulator with hafnium oxide and uses metal instead of polysilicon for the gate. The high-k dielectric is deposited by ALD. The hafnium oxide film needs to be highly conformal, pinhole-free, and tightly controlled in thickness to serve its insulating purpose, which makes ALD ideally suited to the task. Additionally, ALD wins because the more standard CVD process leaves excess particles when depositing hafnium oxide.

Then, as logic entered the FinFET era and transistors became 3D (instead of planar), the need for ALD increased further. In a FinFET, the channel sticks up like a fin and the gate surrounds it on three sides. The effect is that the gate can better control the current flowing through the transistor, which reduces leakage and lowers the voltage needed to operate the transistor. The gate oxide also wraps around the fin and is no longer a flat film, making step coverage more difficult to achieve. This task was already handled by ALD, and the new, more challenging topography only makes ALD better suited to it.

Overall, in logic we don't see structures with aspect ratios as high as in memory. Despite this, batch ALD is still used by TSMC, which is KE's second largest customer. Some films require ALD but are simpler (as opposed to the very complex and critical films that use ASM single-wafer tools) and need multiple passes per wafer, so batch processing is advantageous on total cost of ownership.

In this case, KE's and TEL's batch ALD tools are more like workhorse deposition tools than the dedicated single-wafer tools from ASMI, Lam Research, and AMAT. One example is batch ALD used to deposit spacers on the gate sidewalls of FinFETs. The spacer is a low-k film whose purpose is to reduce the capacitance between the gates.

As you can see, the spacer needs to be deposited on top of the protruding gate, which has a relatively high aspect ratio compared to the high-k dielectric between the gate and channel.

This is where batch processing and KE's tools come in for logic. Another application where batch ALD is well suited is gap fill for trench isolation. Shallow trench isolation is a technique used to prevent unwanted electrical interference and parasitic leakage between individual circuits. The trenches are created through an etching process and then filled with a dielectric such as silicon dioxide, which can be done through batch ALD.

For FinFETs, we saw that wrapping the gate around three sides of the channel improved the electrical characteristics, so wouldn't coverage on all four sides be better? Basically, yes. This is the approach of the next-generation gate-all-around (GAA) transistor architecture. The channel becomes a series of vertically stacked nanosheets within the gate. The gate now surrounds the channel on all four sides, compared to only three in FinFETs, allowing greater drive current and better leakage control, thereby improving power consumption.

Looking closely at the gate, it is actually a high-k metal gate stack surrounding each nanosheet (denoted "Epi Si" in the image below). Controlling the threshold voltage requires multiple dipole and work function metal layers.

ALD is necessary to deposit these films because they must be thin enough to all fit within such shrinking gates. GAA therefore requires more ALD steps than FinFET.

Review Editor: Huang Fei
