In order to reduce development time, electronic equipment manufacturers and PCB designers need to ascertain the thermal aspects of a design at the earliest possible stage of the design process. In the early days of thermal management, the thermal analysis supplying such information was conducted mainly by rules of thumb and simplified correlation-based calculations, which led to a certain expected degree of over-design (most recommended reference below).
With the emergence of Computer-Aided Engineering (CAE), and especially Computational Fluid Dynamics (CFD), simulation was widely adopted for the thermal management of electronic systems and modules. It shortens design-cycle time, reduces iteration expense and provides tight thermal predictions, allowing the electronic designer enhanced productivity and better utilization of capabilities (recommended reference below).
However, as commercial software vendors keep improving the capabilities of their thermal management packages, the influence of the thermal engineer on the performance and reliability of the design has also increased, and incorrect thermal design decisions may even result in catastrophic field failures.
The post is based on a series of lectures I gave in 2014-2019 and aims to provide some “best practice” guidelines for the thermal design and analysis of electronic equipment, ranging from pre-simulation thermal modeling, through decisions about thermal design, to the final goal of validating the thermal design by computerized simulation and analysis.
The methodology begins at package level and works its way up to board and then system level thermal design and analysis considerations:
- Part I: package level modeling and TIMs.
- Part II: board level modeling and cooling solutions.
- Part III: chassis level analysis, CFD and turbulence modeling, and an objective review of commercial software packages.
Part I – Component Level Thermal Design and Analysis
Thermal Modeling of IC packages
As soon as CAE became a dominant, capable tool, the industry began to realize that although state-of-the-art commercial software packages can solve pure linear conduction equations in the blink of an eye, the predictions suffer from severe inaccuracies due to unsatisfactory thermal modeling of the package itself. As a detailed model is the proprietary data of the package manufacturer, thermal models are frequently based on a thermal resistance network containing two (e.g. junction to case) or more nodes, describing the thermal characteristics of the package as published by its vendor and based on JEDEC standardization.
Thermal engineers then apply those thermal resistances to construct a simplified component thermal model (CTM), intended to reproduce the thermal performance of a component in a wide variety of simulations, following the common saying that “it’s the best they have”. The most commonly supplied characteristics are the junction-to-ambient thermal resistance (JESD51-2A) and the junction-to-board/case thermal resistance (JESD51-8).
A two resistor thermal model
However, attention is rarely paid to the fact that these characteristics are both environment and boundary-condition dependent, while the board test measurements are standardized (JESD51-9 to JESD51-11), often rendering such a thermal model unsatisfactory for a general application-environment thermal simulation.
The following methodology allows an iterative estimation of missing thermal resistance data for a simple 2-R network representation:
For components missing Rj-b (only Rj-a defined), the data can be completed by simulating the JESD51-2A standard experiment defining Rj-a (the simulation setup is displayed below).
The standard defines the setup (calibration, exact placement, PCB per JESD51-9/7, via application, power and environmental conditions).
The methodology is to assume an Rj-b in an iterative manner, solve for the component's Rj-a after calculating Tj, and compare to the manufacturer's data.
Simulation of JESD51-2A junction to ambient
thermal resistance experiment
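The iterative scheme above can be sketched in code. Since the actual JESD51-2A CFD run cannot be reproduced here, a simple lumped resistor network stands in for the solver: the board path (assumed Rj-b in series with a hypothetical board-to-ambient resistance) in parallel with the case path (Rj-c in series with a hypothetical case-to-ambient resistance). All resistance values are illustrative, not vendor data.

```python
# Sketch of the iterative Rj-b estimation. A lumped network stands in for
# the JESD51-2A CFD solve; every resistance value here is illustrative.

def rja_of_rjb(rjb, rjc=5.0, r_ba=20.0, r_ca=40.0):
    """Junction-to-ambient resistance [K/W] predicted by the stand-in network:
    board path (rjb + r_ba) in parallel with case path (rjc + r_ca)."""
    return 1.0 / (1.0 / (rjb + r_ba) + 1.0 / (rjc + r_ca))

def estimate_rjb(rja_vendor, lo=0.1, hi=50.0, tol=1e-4):
    """Bisect on the assumed Rj-b until the predicted Rj-a matches the datasheet."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if rja_of_rjb(mid) < rja_vendor:   # predicted Rj-a too low -> raise Rj-b
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

rjb = estimate_rjb(rja_vendor=25.0)
print(f"estimated Rj-b = {rjb:.2f} K/W")   # converges to ~36.25 K/W here
```

In practice each `rja_of_rjb` evaluation would be a full CFD solve of the standard test environment, with the bisection wrapped around it.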
So how can a valid thermal model for a package be constructed?…
The answer came about 15 years ago, under the umbrella of “DEvelopment of Libraries of PHysical models for an Integrated design environment” (in short, DELPHI). DELPHI models are constructed to be Boundary-Condition Independent (BCI). Models that satisfy this requirement can be used in any application-specific environment and are developed by the vendors without any information about the operating environments.
However, such models are usually to be found only for components such as FPGAs, DSPs and processors, generally those which are both highly dissipative and thermally critical on the board (an understandable, albeit unsatisfactory, situation).
DELPHI compact model (from JEDEC JC15-1)
The following is an example for validating a DELPHI model for an ASIC package:
So, if BCI models are seldom given, should we not analyze at the component level at all?
Certainly not. First, always ask the vendor for a sophisticated thermal resistance network if the specific component is known to be close to its derating range in some application-specific environments, such that its reliability is endangered. Nonetheless, to get a truly tight margin of safety I would recommend a highly sophisticated tool recently developed by Mentor Graphics (I’m not getting paid for that, I was just given a private webinar… 😉 ) called T3STER, a HW/SW tool for accurate thermal characterization of ICs (transient and steady-state).
Mentor Graphics T3ster for accurate
thermal characterization of ICs
After obtaining an accurate thermal characterization of an IC, the next step is constructing a thermal model. In dedicated thermal management commercial software packages (FloTHERM, Icepak, Coolit, 6SigmaET, etc.) it is possible to define thermal resistance networks as objects. Otherwise, if a general FEA/FV code is at hand, and in the common case of a 2-resistor characterization, a Compact Conduction Model (CCM), comparable to a compact resistor model, should be constructed according to the relation:
K = L / (R∙A), where L is the average length traveled by the heat flux, A is the average cross-sectional area normal to the heat flux, R is the thermal resistance (given by the vendor) and K is the conductivity to be calculated for the thermal model.
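A minimal sketch of applying this relation, assuming a hypothetical 10 mm × 10 mm package footprint, path length and vendor junction-to-case resistance (all values illustrative):

```python
# Turning a vendor resistance into an equivalent block conductivity for a
# Compact Conduction Model, via K = L / (R * A). Dimensions are illustrative.

def ccm_conductivity(R, L, A):
    """Equivalent conductivity [W/(m K)] of a block of path length L [m] and
    cross-section A [m^2] that reproduces a thermal resistance R [K/W]."""
    return L / (R * A)

A = 0.010 * 0.010          # hypothetical 10 mm x 10 mm footprint [m^2]
L_jc = 0.5e-3              # hypothetical die-to-case path length [m]
R_jc = 2.0                 # hypothetical vendor junction-to-case value [K/W]

k_jc = ccm_conductivity(R_jc, L_jc, A)
print(f"equivalent k for the junction-to-case block = {k_jc:.2f} W/(m K)")
```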
It should be remembered that the thermal resistance between two surfaces is essentially the temperature difference between two isothermal surfaces divided by the heat that flows between them. As such, local temperature differences in close proximity to the package top/bottom are not the goal of the analysis (and should be regarded as inaccurate no matter how refined the mesh is); it is the global picture of the flux that should match the physics.
A further important issue is the modeling of thermal vias. Thermal vias, whether they go through the entire PCB or are blind to a certain layer, may improve the normal-to-plane thermal resistance of the PCB by a considerable amount. This is especially important for ICs which contain hundreds of them (such as ball-grid array FPGAs), or when the thermal design has a tight margin of safety due to extremely high environmental temperatures (such as for airborne electronic equipment) or a sensitive, highly dissipative component. Modeling such a via-filled volume is straightforward in dedicated thermal management commercial packages, where such an object can be defined directly. In general codes, the simplistic approach is to replace a comparable volume (a box) of the PCB, generally modeled as an orthotropic material (with a normal-to-plane conductivity and an in-plane conductivity that is about an order of magnitude larger; I shall return to that in Board Level Thermal Design and Analysis), with a volume (a box of the same size) of the same in-plane conductivity and an improved normal-to-plane conductivity (it’s easy to calculate, and I'm happy to share privately on demand 🙂 ).
Thermal vias in PCB
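A sketch of that normal-to-plane improvement, using a rule-of-mixtures estimate (plated copper barrels conducting in parallel with the laminate). The via count, drill size, plating thickness and region size below are illustrative, not taken from any real design:

```python
import math

# Effective through-thickness conductivity of a PCB region populated with
# plated through-hole vias: copper barrels in parallel with the laminate.
# All geometry values are illustrative.

def via_field_kz(n_vias, drill_d, plating_t, region_area,
                 k_cu=390.0, kz_pcb=0.3):
    """Effective normal-to-plane conductivity [W/(m K)] by rule of mixtures."""
    r_out = drill_d / 2.0
    r_in = r_out - plating_t
    barrel_area = math.pi * (r_out**2 - r_in**2)   # copper annulus per via
    phi = n_vias * barrel_area / region_area       # copper area fraction
    return phi * k_cu + (1.0 - phi) * kz_pcb

kz = via_field_kz(n_vias=100, drill_d=0.3e-3, plating_t=25e-6,
                  region_area=(10e-3)**2)
print(f"effective k_z under the component: {kz:.1f} W/(m K)")
```

Even a ~2% copper area fraction lifts the through-thickness conductivity from a fraction of a W/(m∙K) to several W/(m∙K), which is why via fields matter so much under BGAs.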
Now for some package thermal modeling myths to be refuted…
- The intent of the junction-to-ambient thermal resistance characteristic is solely a thermal performance comparison of one package to another in a standardized environment. This methodology is not meant to, and will not, predict the performance of a package in an application-specific environment.
- The thermal characterization parameters, ΨT and ΨA, have the units of K/W but are mathematical constructs rather than thermal resistances, because not all of the heating power flows through the exposed case surface.
- In all cases a thermal resistance can be defined, but in many practical cases the physical significance of the definition is subject to doubt, with the exception of those definitions that incorporate the ultimate heat sink (an ambient at uniform temperature) as one of the nodes. As a consequence, the widespread notion of a universal analogy between electrical and thermal resistance hampers a correct understanding of the physics. It should always be remembered that heat transfer is a 3D phenomenon, described by the Navier-Stokes and energy equations (i.e. convection and conduction) plus a radiation relation incorporating reciprocity, with extremely complex and circular cause-and-effect relations between its dependent variables.
- The accuracy of a compact thermal model should be related to a calibrated detailed model for a specific boundary condition. One may never deduce the accuracy of a CTM from a CFD analysis, which is itself dependent on many uncertainties ranging from material properties to turbulence modeling.
Cooling enhancement on package level – Thermal Interface Materials (TIMs)
When cooling of highly dissipative components dominated by conduction through the PCB does not suffice, thermal management dictates the addition of another dominant heat flow path. This is often accomplished by adding a heat spreader, whether for improved conduction, as present in a ruggedized cover design, or for convection, by enlarging the wetted surface area above the component with fins (plate/pin).
Due to micro-level imperfections in the mating surfaces of the spreader and the component, the actual contact area could be as little as 1% of what is apparent on a macroscopic level, the rest being air-filled gaps with very low thermal conductivity (0.026 W/(m∙K) at standard conditions):
Air gap in touching surfaces due to imperfections (left)
Vs. touching interface filled (right)
In some applications, where the distance between the surfaces is larger due to the construction of the mechanical outline (tolerance build-up) or the natural variation of nominal component size, there will be no contact at all between the materials and a gap filler is needed:
Thermal gap filled by gap filler in non-touching surfaces
Thermal management engineers tend to rate TIMs according to the material thermal conductivity supplied by the vendor, while the actual thermal resistance is a function of both the thermal conductivity of the specific material and the contact resistance between the TIM and the mating surfaces:
TIM thermal resistance
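The relation in the figure can be sketched as the sum of a bulk conduction term and the two contact terms. The bond-line thickness, pad conductivity and contact resistances below are illustrative values, not vendor data:

```python
# Total TIM resistance: bond-line conduction plus the two surface contact
# resistances. All numeric values are illustrative.

def tim_resistance(blt, k, area, rc_top, rc_bottom):
    """Total TIM resistance [K/W]. blt is the bond-line thickness [m],
    rc_top/rc_bottom are area-specific contact resistances [K m^2/W]."""
    r_bulk = blt / (k * area)
    r_contacts = (rc_top + rc_bottom) / area
    return r_bulk + r_contacts

R = tim_resistance(blt=0.2e-3,          # 0.2 mm bond line
                   k=3.0,               # pad bulk conductivity [W/(m K)]
                   area=(15e-3)**2,     # 15 mm x 15 mm contact
                   rc_top=10e-6, rc_bottom=10e-6)
print(f"total TIM resistance = {R:.3f} K/W")
```

Note that with these (assumed) numbers the contact terms contribute a non-negligible share of the total, which is exactly why rating a TIM by bulk conductivity alone is misleading.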
Having said all that, the TIM thermal resistance is far from being the only important characteristic when evaluating which TIM to choose.
- Thermal conductivity of the material – of course this parameter has a direct impact on thermal performance; in what follows we shall see that choosing a highly conductive material has its tradeoffs.
- Conformability and “wetting” of surfaces – As explained above, we would like the contact resistance between the TIM and mating surfaces (component and heat sink) to be low.
- Compressibility of the material – in order to achieve a thermally conductive material, gap pads consist of polymer matrices (of low thermal conductivity) with highly conductive particles or fibers embedded. There is a trade-off between the ability to deform and the thermal conductivity: the more filler used, the harder the pad gets.
- Reliability – when supplying a ruggedized module, or a computer processor with a heat sink attached, we wouldn’t like its performance to deteriorate sharply in a short time. In order to verify an acceptable deterioration time, reliability tests consisting of high and low temperature cycles and a vibration test should be performed to assess and benchmark the TIM's life expectancy.
- Environmental sustainability – besides long-term reliability, the choice of a specific TIM should also be affected by the immediate environmental conditions the TIM is going to be exposed to.
Types of TIMs:
- Thermal grease: a highly conformable material that wets the surfaces well under low pressures, often made of silicone. The grease itself has very low thermal conductivity, but it is enhanced by loading it with highly conductive particles. A very thin layer can be applied due to its low viscosity. The main disadvantages are that it is messy during application, can “pump out” or dry, and tolerances must be adjusted accordingly.
- Phase-Change Materials (PCMs): solid at room temperature, but changing to a liquid state as they heat up. PCMs are mostly of the organic type (paraffin, fatty acids, etc.).
Among the advantages of PCMs are a suitable melting point (typically between 50°C and 90°C), high heat of fusion, good stability during thermal cycling, low viscosity in the liquid state and high thermal conductivity. The main advantages of PCMs over thermal grease are that they are easier to work with and have better stability over time.
The disadvantages of choosing PCMs (over thermal grease) are slightly lower thermal performance, as both the bulk thermal conductivity is lower and the surface resistance is higher, and the need for a higher contact pressure, which increases the mechanical stresses in the package.
Phase change material
- Gap pads: thicker materials (typically 0.2 mm - 3 mm) that can be used when the surfaces in the thermal interface are not in direct contact with each other, hence serving as gap fillers. Gap pads usually consist of polymer matrices (of low thermal conductivity) with highly conductive particles or fibers embedded, hence a trade-off between the ability to deform and the thermal conductivity is met: the more filler used, the harder the pad gets.
Among the advantages of using a gap pad is that they can be deformed and are therefore not sensitive to tolerance issues in the assemblies.
The disadvantages of choosing gap pads for the thermal design are that the larger thickness means a longer distance for heat to travel, hence an increased “bulk” thermal resistance, and that the surface resistance increases, or the TIM loosens from the surface, if the applied pressure is too low.
Thermal gap pad
- Putties: the main matrix is often silicone based, with filler materials such as aluminium or boron nitride, and they serve as gap fillers. The denser the putty, the harder it is to compress, so there is a trade-off between the ability to deform and the thermal conductivity: the more filler used, the harder the putty gets. The pressure applied to compress putties is time dependent, with a peak during the initial phase before the material relaxes.
Among the advantages of using putties are that they compress at low pressures (good for the components), are reusable and have thermal conductivities of up to 17 W/(m∙K).
The disadvantages of choosing putties for the thermal design are that their adhesiveness is more pronounced than that of gap pads, making large putty-filled surfaces harder to disassemble. Furthermore, they may “pump out” after continuous temperature cycling.
Putty gap filler
- Carbon-based TIMs: by bonding to other elements or to other carbon atoms, a great variety of materials can be formed (nano-materials), all with different mechanical and thermal properties: diamond, graphite, graphene, carbon nanotubes and pyrolytic graphite. It’s all about playing with the bonds and their orientation.
Among the advantages of using carbon-based materials is the control over material properties (CTE, thermal and electrical conductivity, hardness).
The major (and sometimes “show stopper”) disadvantage is that they are very expensive due to the production process (such as CVD).
Carbon (Graphene) sheet
- Thermal gels: grease-like at first; however, after they have been applied and conformed to the surface, they are treated with heat to transform them into a thin rubber film. Gels can be applied to gaps of up to 3-5 mm without sagging.
Among the advantages of using thermal gels are that they are less messy to work with and easier to remove than grease, don’t “dry out” and can be applied at higher tolerances.
The main disadvantages of thermal gels are that they are costly, as an extra curing step is needed during the manufacturing process, and especially that good care must be taken with the method of application.
Don’t forget reliability…
A whole field, touching the thermal design in every aspect is that of reliability.
The electronics industry is realizing that derating strategies are not the best methods for designing optimal electronics. The broad assumptions of derating strategies can result in conservative, expensive designs or designs with insufficient reliability.
In essence, there seems to be an inherent difficulty in knowing how cool an environment components need in order to operate without thermal degradation of themselves and of neighboring components.
A more effective and efficient approach is to couple the survey of components that are susceptible to temperature degradation with thermal simulation, such that our trusted thermal analysis tool provides the infrastructure for a subsequent physics-based reliability analysis conducted by a dedicated and powerful tool (one such tool is Sherlock DFR).
The following presentation is by no means all-encompassing, nor accurate enough to cover the whole electronics reliability discipline, but the essence is captured, and even more so communicated…
PART II – Board level thermal analysis
In Part I, package level analysis and thermal design, along with the important topic of Thermal Interface Materials, were discussed. Continuing along our heat flux path, the next stop is board level thermal design and analysis. In order to understand the possibilities in handling such a task, I shall present some cooling solutions applied in board level thermal design, which allow the heat flux to continue its route while minimizing the temperature difference from the point where it leaves the component (and its TIM) to the point where it leaves the board.
Board Level Thermal Modeling
Thermal modeling of a Printed Circuit Board (PCB)
The PCB is formed of thick layers of a low-conductivity material (such as FR4), interleaved with what can be considered thin layers of which some percentage is copper (power, signals, ground).
PCB layers description
As the thermal resistance between two surfaces is essentially the temperature difference between two isothermal surfaces divided by the heat that flows between them, local temperature differences in close proximity to the package top/bottom are not the goal of the analysis (and should be regarded as inaccurate no matter how refined the mesh is); it is the global picture of the flux that should match the physics. Hence there is no added value in a detailed PCB thermal model (except in rare cases where the heating of signal traces is to be explored), which would also prove quite prohibitive from a numerical standpoint.
As a consequence, the PCB is modeled as an orthotropic material, with in-plane and normal-to-plane thermal conductivities.
The equations for calculating each of these conductivities, given the values for the thickness and thermal conductivity of each layer, are presented below:
PCB thermal analysis orthotropic conduction model
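The layer-averaging relations can be sketched as follows: in-plane, the layers conduct in parallel (a thickness-weighted mean), while normal-to-plane they conduct in series (a harmonic mean). The four-layer stack and copper coverage fractions below are illustrative, not a real board design:

```python
# Orthotropic PCB conductivities from the layer stack: parallel (in-plane)
# and series (through-plane) averaging. Stack values are illustrative.

def pcb_orthotropic_k(layers):
    """layers: list of (thickness [m], conductivity [W/(m K)]) tuples.
    Returns (k_in_plane, k_normal)."""
    total_t = sum(t for t, _ in layers)
    k_in = sum(t * k for t, k in layers) / total_t        # parallel paths
    k_norm = total_t / sum(t / k for t, k in layers)      # series paths
    return k_in, k_norm

def layer(thickness, copper_fraction, k_cu=390.0, k_fr4=0.3):
    """One conductor layer: copper coverage averaged with laminate
    (a common simplification for signal/plane layers)."""
    return thickness, copper_fraction * k_cu + (1 - copper_fraction) * k_fr4

stack = [
    layer(35e-6, 0.8),    # top signal layer, 80% copper coverage
    (0.7e-3, 0.3),        # FR4 dielectric
    layer(35e-6, 1.0),    # solid ground plane
    (0.7e-3, 0.3),        # FR4 dielectric
]
k_in, k_norm = pcb_orthotropic_k(stack)
print(f"k_in-plane = {k_in:.1f}, k_normal = {k_norm:.2f} W/(m K)")
```

With this (assumed) stack the in-plane conductivity comes out more than an order of magnitude larger than the through-plane value, consistent with the modeling guideline quoted in Part I.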
Bolted joints and ruggedized cover
As explained in Part I, when cooling of highly dissipative components dominated by conduction through the PCB does not suffice, thermal management dictates the addition of another dominant heat flow path. This is often accomplished by adding a heat spreader, which is bolted to the PCB. From a thermal design perspective, the more bolts we use, the lower the contact thermal resistance of the bolted joint. Nonetheless, many bolts mean a lot of drill holes in the PCB and less space to be populated by components. A tradeoff is to be made such that the pressure applied by nearby bolts on the TIMs of highly dissipative and/or sensitive components allows the TIM to compress in a predictable manner. An uncompressed TIM is comparable to an air gap.
Furthermore, it should be remembered that bolted joints are not equivalent to galvanic contact. It is customary to define a parameter ascertaining a satisfactory contact resistance, such as the inverse of the thermal resistance per unit area, h (a value of 3000 W/(m²∙K) would be quite conservative). In a dedicated thermal management code, a thermal resistance can be calculated and applied on the contact surface according to the contact area; otherwise, for a general FEM/FVM code, a thermal conductivity can be assigned to a thin volume representing the contact.
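A minimal sketch of that thin-volume workaround: the slab's conductivity simply has to reproduce the same areal conductance, k = h∙t. The 0.1 mm slab thickness is an assumption for illustration; the h value is the conservative figure quoted above:

```python
# Representing a contact conductance h [W/(m^2 K)] as a thin solid slab of
# thickness t [m] in a general FEM/FVM mesh. Slab thickness is illustrative.

def contact_volume_k(h, t):
    """Conductivity [W/(m K)] of a slab of thickness t that mimics a
    contact conductance h, since R''_contact = 1/h = t/k."""
    return h * t

t = 0.1e-3                       # 0.1 mm slab in the mesh (assumed)
k = contact_volume_k(3000.0, t)  # conservative h from the text
print(f"assign k = {k:.2f} W/(m K) to the {t*1e3:.1f} mm contact slab")
```

The slab thickness is arbitrary as long as k scales with it; a thicker slab meshes more easily but must get a proportionally higher conductivity.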
Conduction-based ruggedized cover
In the above picture, a conduction-based ruggedized cover is presented, a cooling methodology which is generally applied for harsh environmental conditions (airborne/sea). Highly dissipative or sensitive components are attached via TIM to the cover which serves as a heat spreader, while the heat flows through the spreader (generally made of Aluminum or Copper) to the sides of the cover where wedge-locks apply pressure to attach the sides of the cover to a chassis (or base-plate in general).
This methodology is termed conduction-based ruggedized cover, as the mode of conduction is dominating the heat transfer process.
Since the attachment of the cover to the chassis is not galvanic, a contact resistance between the ruggedized cover and the chassis is added to the simulation as two thin colored volume objects, with a conductivity derived from the wedge-lock contact thermal resistance supplied by the vendor or by experiment (a complete setup of an experiment for estimating the wedge-lock attachment contact resistance can be shared on private request).
attachment contact resistance by wedge-locks estimation (by Curtiss Wright)
Board Level Cooling Solutions
As with TIMs, which are a cooling solution applied at the package level, there are instances where thermal management dictates the use of board level cooling solutions, so as to allow the heat flux to continue its route while minimizing the temperature difference from the point where it leaves the component (or TIM) to the point where it leaves the board. Such instances occur when the heat dissipation density is high and the addition of a cooling solution may diminish the hotspot by spreading or by thermal short-circuiting.
In the following paragraphs, some recommended cooling solutions are presented. As there exists an abundance of cooling solutions, I shall present only those I have found to be the most cost-effective and easy to apply. While exploration of new cooling methodologies is always advised for the thermal management engineer, it is also advised to keep the cooling solution as simple as possible while still achieving the goal of tightening the thermal design margin of safety. New and exotic cooling solutions should be approached with caution, because they often present new technologies which may contain hidden, sometimes not so well understood, failure modes.
A heat pipe is a hollow, enclosed tube structure containing a working fluid (usually water, for a copper structure) that transfers heat as it evaporates, and a wick that brings the fluid back to its starting point when it condenses. As the entire thermal cycle is conducted without outside interference, a heat pipe is considered a passive liquid cooling device.
A typical heat pipe consists of a partially evacuated tube (such that its pressure is slightly below standard atmosphere), its inside wall lined with a capillary wick structure (porous ceramic, for example) and a small amount of fluid to vaporize in the process. When heat is applied to the hot end, the fluid within the pipe vaporizes, generating a force that drives the vapor to the cold end of the tube, where the removal of heat condenses it back into the wick. A heat pipe should operate in any orientation due to the wick’s capillary action, but performance is often degraded when it is forced to work against gravity, i.e. when the heat input end is higher than the output and the liquid in the wick must move up the tube.
Heat pipe design and thermal cycle
One of the biggest problems concerning heat pipes (and the one I get the most questions about in my lectures) is reliability. Many thermal designers are concerned that degraded performance may appear within as little as a few months of operation. This scenario is unacceptable, as the slow increase in operating temperature usually goes unnoticed until malfunctions or complete failure occur.
The greatest cause of degraded heat pipe performance is contamination that affects the vapor pressure inside the tube. To avoid such an occurrence as much as possible, care must be taken with the manufacturer's sealing process (such as electron beam welding, utilization of clean rooms for heat pipe assembly and a “burn-in” test of at least 50 hours), even more than with the nominal characteristics supplied.
In most cooling solution configurations incorporating heat pipes, they serve as a thermal short-circuit, but heat pipes may also be used for in-plane heat spreading by attaching two, or a network of them, in parallel. An extremely ingenious utilization of such a network is presented in the following presentation by Airtop:
A vapor chamber is essentially a planar heat pipe and as such can be classified as a heat pipe device. Its thermal process is described by the same general thermodynamic cycle, but its role is rather that of a heat-spreading cooling solution, placed above a high-density dissipative component to diminish a hotspot.
Vapor chamber description
A vapor chamber, much like heat pipes, is easy to apply and is optimally constructed to utilize heat pipe thermal cycle for planar heat spreading.
Vapor chamber utilization for heat spreading purposes
Nonetheless, as vapor chambers are not cheap, the thermal engineer should always consider approximating vapor chamber operation by attaching a series of flat heat pipes, which might not be as effective in diminishing the hotspot but shall certainly be cost-effective.
Aluminum cover with embedded pyrolytic graphite
Carbon is one of the most common atoms on earth. Carbon atoms have four valence electrons and can create bonds with up to four other atoms. This bonding to other elements or to other carbon atoms, allows a great variety of materials to be formed.
One of these materials (perhaps the best known and most popular besides diamond) is graphite, formed of a layered structure with strong covalent bonds within the layers and weak van der Waals bonds connecting the layers (as opposed to diamond, which has only strong covalent bonds; anyone who has held a pencil against a diamond should appreciate the difference in structure 😉 ). The thermal conductivity is on the order of 1,000 W/(m∙K) in the plane but only around 5 W/(m∙K) through the thickness, making it a good heat spreader.
The cooling solution presented here embeds a thin graphite plate in an aluminum ruggedized cover, attaching the interfaces through a specific chemical process so as to avoid air voids, which would hamper its thermal performance.
Such a procedure conducted by CPS is presented in the following figure:
AlSiC with embedded graphite cooling solution
Thermo-Electric Coolers (TECs)
TECs create a heat flux through the Peltier effect. A TEC consists of several N-P semiconductor pellets connected electrically in series and thermally in parallel between two ceramic plates. With the application of a DC current of proper polarity, heat is pumped from one plate to the other, making the first one cooler.
I couldn’t leave TECs out of the review, as they are mandatory in situations where a passive cooling solution will not suffice (such as achieving a component junction temperature below that of the boundary conditions). However, TECs have extremely low power efficiency, a fact that could prove quite problematic if the power budget is important, and furthermore they sometimes tend to hamper the overall MTBF of the board.
Part III – System Level Thermal Analysis
In Part I, package level analysis and thermal design, along with the important topic of Thermal Interface Materials, were discussed. Then, in Part II, the thermal route continued to board level, as design and analysis guidelines were given and some frequently used board level cooling solutions were introduced. To complete the voyage to its final destination (ambient, the ultimate heat sink), system level design and analysis considerations are now discussed, with special emphasis on my favorite subject, CFD and turbulence modeling, finishing with my personal experience and recommended choices for dedicated thermal management and general purpose commercial software packages.
System Level Thermal Management Considerations
When designing the thermals at system level, some crucial decisions have to be made concerning the dominant mode of heat transfer (conduction, natural convection, forced convection, radiation). These crucial choices are strongly affected by the environmental conditions, which are often standardized (MIL-STD-810x, RTCA-DO-160, MIL-STD-167, ISO/DIS-19453, IEEE-STD-1478, J1455, etc.) in order to help tailor the qualification conditions that the mechanical design should meet. It is highly recommended to be aware of the different subtleties that might have an enormous impact on the thermal design.
MIL-HDBK-5400 – guidelines for temperature and altitude
qualification test conditions
Let us take some examples to understand how and why these decisions crucially affect the design. Suppose we would like to improve the heat transfer of a ruggedized PCB cover by changing its material from aluminum (thermal conductivity of 169 W/(m∙K)) to copper (thermal conductivity of 395 W/(m∙K)) to achieve better overall conduction and heat spreading.
Ruggedized cover different designs
Not taking the environmental standard into consideration might result in catastrophic corrosion if the ruggedized cover is attached to an aluminum ATR chassis and must endure salt-fog qualification testing.
As another example, let's say we are to design an electronic unit dissipating 500 W. We follow a thermal design guideline of choosing the cooling methodology so as to tighten the margin of safety as much as possible. We perform a CFD calculation knowing that the highest ambient temperature the unit shall meet is 71°C. Now suppose that the unit is to be installed on an F-16, inside a non-pressurized (the density varies with bay temperature and flight altitude) conditioned avionic bay (the bay itself enjoys the aircraft’s ECS).
Conditioned and unconditioned avionic bay
Not prescribing a direct ECS flow to the unit shall have catastrophic consequences if we are to meet the F-16 standard for temperature and altitude qualification test conditions (71°C @ 60 kft), as the fans' ability to cool is hampered severely (by an order of magnitude) by the air density at these conditions.
Pre-simulation analysis and the benefit of the “first law of thermodynamics”
Advances in high-tech industries are strongly linked to Moore’s law predicting an exponential growth in computing resources such that the availability of advanced CAE tools to predict three dimensional complex flows and thermal fluxes not available to almost anyone 20 years ago are now a general commodity.
The availability of CFD created a tendency among thermal engineers toward what I call "simulation first". Many "new-age" thermal engineers confront a thermal management problem with a success-oriented methodology, by which simulations are conducted without any appeal to hand calculations (why speculate when you can simulate?…). In many cases where a simpler theory or even a simple desktop experiment would provide an answer with less effort, or even a better answer, there is a temptation to use CFD anyway.
Hand calculations are mandatory! A lot of simulation time may be saved, and some simulations rendered unnecessary altogether, by performing a set of simple calculations before resorting to simulation.
A simple first-law-of-thermodynamics-based integral calculation such as:

q = ṁ·Cp·(Tout − Tin), with ṁ = ρ·V̇

(where q is the heat dissipation, ṁ the mass flow rate, Cp the specific heat, ρ the density, V̇ the volumetric flow rate, Tin the inlet flow temperature to the control volume and Tout the outlet flow temperature from the control volume)
may lead to a decision about which and how many fans to use, what a module attachment temperature boundary condition might be (for the case of a conduction-based ruggedized cover attached to a cold-plate cooled by forced convection), and even to a decision to rule out forced air convection cooling altogether due to insufficient cooling ability.
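As an illustration, this is the whole calculation needed to size the airflow for the hypothetical 500 W unit above; the 15 K allowed air temperature rise is an assumed design choice, not a standard value:

```python
def required_flow_m3s(q_w, rho_kg_m3, cp_j_kgk, dt_k):
    """First-law sizing: V_dot = q / (rho * cp * (Tout - Tin)), in m^3/s."""
    return q_w / (rho_kg_m3 * cp_j_kgk * dt_k)

# Air near a 71 degC ambient: rho ~ 1.03 kg/m^3, cp ~ 1007 J/(kg*K).
flow = required_flow_m3s(500.0, 1.03, 1007.0, 15.0)
print(f"required flow: {flow * 1000:.1f} L/s (~{flow * 2118.88:.0f} CFM)")
```

Comparing that figure against candidate fan curves (density-corrected!) immediately tells whether one fan, several, or none at all can do the job.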
Modes of heat transfer and the effect on system level
In system-level thermal design the choice of heat transfer mode is a crucial one, affecting both continuous operation at full performance and reliability.
The heat flux that can be accommodated for a desired temperature gradient is a function of the effective heat transfer coefficient derived from the chosen cooling technique.
There are many constraints on such a choice, among them mechanical outline limitations, installation options, cost and environmental considerations, but once a choice has been made the thermal design must support it. A chassis designed around cold-plates cooled by forced convection shall perform poorly under natural convection, as there is not enough wetted area for buoyancy effects to support such a heat transfer mode.
Furthermore, if the installation allows attachment of the system to a base-plate, the design must support conduction from the system to the platform by creating a heat flux route as short as possible and allowing optimal heat spreading. Care must be taken, without resorting to over-design, to avoid bottlenecks that may cause high local temperature gradients, such as an insufficient cross-sectional area whose normal is aligned with the heat flux.
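A conduction route is conveniently checked as a stack of 1-D resistances, R = L/(kA). The hypothetical aluminum route below shows how one thin segment becomes the bottleneck:

```python
def conduction_resistance(length_m, k_w_mk, area_m2):
    """1-D conduction resistance of a path segment: R = L / (k * A), in K/W."""
    return length_m / (k_w_mk * area_m2)

# Hypothetical route from component to chassis wall, all aluminum (k = 169):
segments = [
    (0.002, 169.0, 0.0004),   # 2 mm under a 20 mm x 20 mm component footprint
    (0.060, 169.0, 0.00012),  # 60 mm along a 3 mm x 40 mm rib -- the bottleneck
    (0.003, 169.0, 0.0008),   # 3 mm through the chassis wall
]
resistances = [conduction_resistance(*s) for s in segments]
r_total = sum(resistances)
print(f"total: {r_total:.2f} K/W -> {r_total * 20.0:.0f} K rise at 20 W")
```

In this sketch the rib alone contributes about 98% of the total resistance; widening it, not polishing the component interface, is what fixes the design.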
If the installation does not allow a dominating conduction mode and the heat dissipation is low enough to allow passive cooling in the form of natural convection, an optimal heat-sink calculation procedure should be conducted (there is an abundance of online guidelines for that). Here too, care must be taken to design the conduction route from the hotspot to the base plate to be as short as possible to avoid bottlenecking the design, while pursuing the goal of an attachment surface to the heat-sink that is as isothermal as possible (i.e. heat spreading).
Air-Flow-Through Cooling (AFTC) methodology (VITA 48.5), including a vapor chamber to diminish a hotspot above a 5 mm x 5 mm, 30 W component
When heat dissipation density and/or ambient temperature are too high to support natural convection, active cooling by fans should be considered, with an operating point supplying enough volumetric flow to cool the system.
For direct cooling of the board (not allowed by airborne electronic equipment guidelines, for example), a tubeaxial fan is the better choice, supplying relatively large amounts of volumetric flow at a low pressure difference. Otherwise, if the design dictates indirect cooling (due to harsh environmental conditions, for example), such as pushing or drawing the air through a cold-plate (especially for a brazed or folded-fin methodology, where an increased static pressure difference is expected), the better choice is a vaneaxial fan, designed to deliver somewhat less volumetric flow but at a higher static pressure difference, with the drawback of being much noisier.
Vaneaxial fan static pressure difference to volumetric flow fan curve (Q-P curve)
Tubeaxial fan static pressure difference to volumetric flow fan curve (Q-P curve)
While performing a simulation of a fan in dedicated commercial thermal management software, it is customary to define the fan as an object described by its P-Q curve, outer diameter, hub diameter and swirl. It is most important not to forget to correct the input fan curve according to the fan laws (e.g. density correction).
Basic fan laws
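The density correction itself is a one-liner: at constant shaft speed, volumetric flow is unchanged while static pressure scales with density. The reference density of 1.19 kg/m³ and the catalog points below are hypothetical:

```python
def density_correct(curve, rho_actual, rho_ref=1.19):
    """Fan-law correction at constant speed: Q unchanged, dP scales with
    rho_actual / rho_ref. Curve is a list of (m^3/s, Pa) points."""
    scale = rho_actual / rho_ref
    return [(q, dp * scale) for q, dp in curve]

# Catalog P-Q curve quoted at rho_ref; correct it for thin, hot air:
catalog = [(0.00, 250.0), (0.02, 180.0), (0.04, 90.0), (0.05, 0.0)]
corrected = density_correct(catalog, rho_actual=0.55)
print(corrected)
```

Intersecting the corrected curve, not the catalog curve, with the system impedance curve gives the true operating point.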
When forced convection with air as the cooling fluid does not suffice, active liquid cooling should be considered. Accommodating a heat flux of 50 W/cm² across a 50°C temperature difference to its final heat-sink requires an effective heat transfer coefficient of ≈10,000 W/m²·K, achievable only by liquid cooling (possibly aided by an area-enlarging factor from heat spreading).
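The arithmetic behind that statement is simply h = q''/ΔT:

```python
def required_htc(q_flux_w_cm2, dt_k):
    """Effective heat transfer coefficient needed to push a heat flux
    across a temperature difference: h = q'' / dT, in W/(m^2*K)."""
    return q_flux_w_cm2 * 1.0e4 / dt_k  # 1 W/cm^2 = 1e4 W/m^2

h = required_htc(50.0, 50.0)
print(f"required h: {h:.0f} W/(m^2*K)")
```

For perspective, natural convection in air typically delivers on the order of 5-25 W/m²·K and forced air perhaps 10-250; coefficients in the 10⁴ range are liquid (or boiling) territory, unless heat spreading enlarges the wetted area and proportionally relaxes the requirement.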
Designing an active liquid cooling system is a complicated task, and care must be taken with reliability issues (e.g. leaks) as much as with performance. The most important aspect to remember is that adding active liquid cooling, whether off-the-shelf or a custom design, means adding another system, which might complicate qualification by doubling the qualification process. It is therefore highly recommended to avoid resorting to active liquid cooling if ambient conditions and component sensitivity allow it.
Active liquid cooling
Small scale and miniature vapor compression refrigeration systems
Microchannel evaporators and condensers are vital components of these miniature systems.
A miniature vapour-compression system consists of the following main components:
- Miniature compressor
- Microchannel condenser
- Throttling device
- Microchannel evaporator
Before turning to some "hardcore" cooling solutions, let us loosely define what is actually meant by "microchannel"; the loosest definition would be a channel with a hydraulic diameter below 1 mm.
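Under that loose definition, classification reduces to computing the hydraulic diameter D_h = 4A/P; the channel dimensions below are just an example:

```python
def hydraulic_diameter(area_m2, perimeter_m):
    """Hydraulic diameter of a duct: D_h = 4 * A / P, in meters."""
    return 4.0 * area_m2 / perimeter_m

def is_microchannel(d_h_m):
    """Loose classification used here: microchannel if D_h < 1 mm."""
    return d_h_m < 1.0e-3

# Example: a 0.3 mm x 1.5 mm rectangular channel.
w, h_ch = 0.3e-3, 1.5e-3
d_h = hydraulic_diameter(w * h_ch, 2.0 * (w + h_ch))
print(f"D_h = {d_h * 1000:.2f} mm, microchannel: {is_microchannel(d_h)}")
```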
Microchannels are used in fluid control and heat transfer. The microchannel concept was first proposed by Tuckerman and Pease, who suggested an effective method for designing microchannels in laminar, fully developed flow. Such single-phase microchannels have been experimentally shown to increase the ability to remove heat from confined spaces in which heat would otherwise seem trapped.
Flow Boiling in Microchannels – Nucleate Boiling
When the surface temperature is approximately 4-13°C above the saturation temperature, isolated bubbles form at nucleation sites and separate from the surface.
This separation induces considerable fluid mixing near the surface, substantially increasing the convective heat transfer coefficient and the heat flux.
In this regime, most of the heat transfer is through direct transfer from the surface to the liquid in motion at the surface and not through the vapor bubbles rising from the surface.
Between roughly 10°C and 30°C above the saturation temperature, a second flow regime may be observed. As more nucleation sites become active, increased bubble formation causes bubble interference and coalescence. In this region the vapor escapes as jets or columns which subsequently merge into slugs of vapor.
This phase ends with what is termed surface "dry-out", where the "transition boiling" phase initiates and the heat transfer coefficient drops along with the ability to remove heat. This lasts until the Leidenfrost point is reached, initiating the "film boiling" phase, in which the wall is too hot to be of interest as far as electronics cooling is concerned, although the heat transfer coefficient rises sharply again (for interesting reasons that are beyond the scope of this post).
Flow boiling in microchannels is one of the most promising cooling methods for such high heat flux densities, due to its specific attributes, namely:
- The capability of achieving very high heat transfer rates with small variations in the surface temperature, thereby significantly reducing thermo-mechanical stresses inside the chip (accentuated under harsh military standard environmental conditions).
- The capability of achieving very high heat transfer rates at small liquid flow rates compared to single phase cooling resulting in a very compact cooling system which allows weight and complexity reduction.
- The increase of the heat transfer coefficient with increasing heat flux in nucleate boiling, and its relative independence of flow rate, result in a reduction of component temperature and make a speed-controlled pump redundant, both carrying a substantial impact on MTBF.
However, the transition from laboratory research to commercial applications is hindered by several fundamental issues that are still not well understood, one of which is the lack of generally accepted prediction methods for flow patterns, heat transfer and pressure drop.
Nonetheless, steps have been taken to develop a comprehensive CFD model for the simulation of boiling flow and heat transfer, from nucleate boiling to critical heat flux, in ANSYS Fluent. The features of the model include:
- RPI boiling model (Rensselaer Polytechnic Institute, by Kurul & Podowski in 1990).
- Non-equilibrium Boiling and Critical Heat Flux.
- Interfacial area: algebraic formulations and the IAC (interfacial area concentration) equation.
- A range of sub-models for drag, lift and turbulent dispersion.
- Liquid/vapor-interface heat and mass transfer models.
- Multiphase turbulence models: mixture, dispersed & per-phase.
- Flow regime transitions.
CFD and turbulence modeling for electronic equipment
Since the late 1980s, the capabilities of computers and CFD, and our knowledge of how to use them, have grown tremendously. Over that time, CFD-based analysis methods have revolutionized the practice of thermal analysis and conjugate heat transfer for electronics. Engineers knowledgeable in CFD can now routinely produce robust designs that need no further design changes due to thermal aspects after the initial phase of analysis, only verification. Moreover, by applying CFD we can explore a much larger number of design geometry variations than would be practical to manufacture during the development phase.
Having said all that, there are important aspects to remember while applauding CFD. Defining the surface geometry and generating the mesh for a calculation can represent a significant investment of engineering time. Three-dimensional simulations require high spatial resolution, and computer time is still a significant cost even for RANS-oriented applications, although it is diminishing rapidly as computational resources increase exponentially and new infrastructures for parallel computing are under development (see the CFD Vision 2030 Study: A Path to Revolutionary Computational Aerosciences).
Then there is another aspect to consider. Effective use of CFD is a specialized skill that requires time and practice to develop. Effective users must often spend much of their careers learning methodologies for simplifying geometry, generating valid meshes and running codes, as well as developing the judgment needed to interpret CFD results effectively.
Among the vast number of comparably important topics in CFD, turbulence modeling stands above all in most fluid dynamics applications of practical engineering importance. This is especially true for forced convection cooling of electronics: confined and complex geometries with many obstructions to the flow, and swirling flows originating from fans, do not lend themselves easily to turbulence modeling.
In general, it is recommended to separate component/board level analysis from system level analysis. This methodology applies when the heat flux from dissipating components is based on conduction to a cold-plate (indirect flow cooling) and no direct cooling of the board exists. In such configurations it is customary to distribute the heat dissipation of a board evenly and solve for the attachment temperature, which then serves as a boundary condition for a subsequent board-level simulation where component and board level modeling apply.
In what follows, a brief summary of frequently used turbulence models is discussed. Before doing so, it is important to keep in mind that while a thermal analysis engineer need not master CFD to the level of an expert practitioner, and while thermal-management-dedicated CFD codes are working on automating many of the topics crucial for a meaningful CFD analysis (proper mesh resolution, choice of turbulence model, stability, etc.), it is still mandatory to understand, at the very least, what exactly is being simulated and by which turbulence modeling approach. Advanced digital graphics and robust (dissipative) numerical schemes nowadays display a complicated flowfield, predicted with all the right general features, in glorious detail that looks ever so impressive and real; we must always remember the limitations of CFD, and especially of turbulence modeling, when applied to the field of electronics.
Turbulence models and commercial codes
When performing CFD for electronic equipment, there are two key observations to make before choosing how to approach turbulence modeling:
- The resolution we are intending to simulate.
Judgment is needed because the simulation of reality that CFD can provide is usually far from perfect. A fundamental physical limitation on accuracy comes from turbulence modeling, and a user may inadvertently use a mesh that is too coarse; in many cases the densest grid one can afford isn't dense enough to provide the resolution needed for an accurate simulation.
- The code that we are intending to use.
In contrast to some other fields of CFD, the flowfields encountered in thermal management of electronic systems are very diverse, including bluff bodies, massive and light separations, transition and relaminarization, etc. While general-purpose high-end codes seem like the best choice to model such flowfields, their "time to market" for results is very long. On the other hand, thermal-management-dedicated codes make it very easy to obtain quick results but suffer from a lack of physical fidelity in their turbulence models with respect to such complicated flows.
I have found it best to hold both a general-purpose high-end CFD code and a thermal-management-dedicated code.
For simulations in which increasing fidelity is to be pursued, such as direct fan-originated flow over a highly populated PCB, general-purpose codes are essential if the practitioner wants the turbulence model to perform inside its range of validity (the exception being ANSYS Icepak, which actually uses the Fluent solver and offers a variety of turbulence models to choose from). That said, my first choice for a general-purpose code would be Siemens PLM STAR-CCM+ or ANSYS Fluent. These codes offer a complete set of tools for various levels of turbulence modeling: the various k-ε turbulence models (standard, renormalization group and realizable); the k-ω SST model, an exceptional model for confined flows; the Local Correlation-Based Transition Model (LCTM) for flows that might exhibit transition and relaminarization; Reynolds Stress Models (RSM), mostly for instances in which curvature and rotation effects must be captured; Scale-Adaptive Simulation (SAS) for flows in which massive separation is encountered (such as exploring the turbulent flowfield around a large component); and Detached-Eddy Simulation (DES) and its many variants, for specific high-resolution exploration tasks involving massive separation and turbulent shear flows, in order to gain insight into the flowfield not otherwise possible.
The latter two families of models (i.e. SAS and the DES variants), supplemented by Large Eddy Simulation (LES) as a standalone or in combination with RANS (i.e. hybrid RANS-LES), are termed Scale-Resolving Simulations (SRS), and there are two main reasons for using SRS models that should generally be mentioned.
The first is for applications in which additional information is needed that cannot be obtained from a RANS simulation: aeroacoustics applications, where turbulence-generated noise cannot be extracted from RANS simulations with sufficient accuracy; material-failure applications governed by unsteady mixing of flow streams at different temperatures, dependent on unsteady heat loading; applications regarding vortex cavitation caused by unsteady turbulent pressure fields; calculation of helicopter loads, which are strongly dependent on the vortices generated by the rotor tips; and the like. SRS might be mandatory in such situations even where the RANS model can indeed compute the correct time-averaged flow field.
The second reason for using SRS models is that although the strength of the RANS methodology has proven itself for wall-bounded attached flows, thanks to calibration according to the "law of the wall", it has shown poor performance for free shear flows, especially those featuring a high level of unsteadiness and massive separation. This follows from its inherent limitations as a one-point closure that does not incorporate strong non-local effects and the long correlation distances characterizing many flows of engineering importance.
Considering that RANS models typically already struggle to cover the most basic self-similar free shear flows with one set of constants, there is little hope that even advanced Reynolds Stress Model (RSM) methodologies will eventually provide a reliable foundation for all such flows.
A review of turbulence modeling can be found in the presentation below.
For indirect fan cooling (flow through a cold-plate), where the required resolution need not be as high and the physical fidelity is limited to extracting board attachment temperatures (for a subsequent conduction-based board-level thermal analysis), yet the geometry contains objects such as fans, contact resistances, porous media, etc., the choice should be a thermal-management-dedicated code. The first advantage of such codes is ease of use and the short time until results are obtained. The main disadvantage is that, considering the complex geometry frequently encountered, the two-transport-equation turbulence model generally applied by such codes operates well outside its limits of validity.
Nonetheless, it is common, also beyond the field of thermal analysis of electronics, to solve complex turbulent flowfields with two-transport-equation turbulence models. For applications such as electronic equipment thermal management, where the aim is extracting the surface attachment temperature of boards (attached by wedge-locks) at a cold-plate station, the results are usually accurate enough, surprisingly so, and can be even more pleasing when subsequently calibrated by a set of simple engineering experiments.
In choosing a thermal-management-dedicated software package, my experience has left me quite satisfied with a few. As explained above, ANSYS Icepak has by far the best diversity of turbulence modeling levels. But as the geometry for indirect cooling is very complex (porous media, brazed cold-plates with 0.2 mm fin spacing, obstructions to the flow in the form of semi-rigid cables finding their way into the flow field due to lack of space, etc.) and the results are extracted to a reasonable accuracy anyway, other codes that are very user friendly and robust may be worth considering. Mentor's FloTHERM XT, which has much more direct connectivity to board design and proves efficient when transferring large models between users (a problem I came across many times in the past), would be my first choice for an electronics-dedicated software package.
A code I found very appealing for this purpose is Mentor Graphics FloEFD. The GUI of FloEFD is native CAD (SolidWorks, UG, Catia, Creo), making it very easy to use, especially for those with previous CAD experience. FloEFD uses an octree mesh methodology, which is not very efficient but is very easy to control. The software has an electronics thermal management bundle, available as an add-on upon purchase, which is not as supportive as its parent thermal-management-dedicated software, FloTHERM, but is satisfactory for indirect system analysis purposes. FloEFD is essentially a general-purpose CFD package, allowing the user to exploit it for applications other than thermal management of electronics.
As far as turbulence modeling goes, FloEFD uses the k-ε model. If the first grid point lies in the innermost layer of the turbulent boundary layer (the "viscous/laminar sublayer"), a low-Reynolds formulation is chosen. Otherwise, if the first grid point is located outside the innermost layer (in the "inertial range"), a wall-functions formulation is applied. FloEFD does this without user intervention, applying an inherent switching function to determine the location of the first grid point and the subsequent adoption of the low-Reynolds or wall-functions formulation.
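FloEFD's actual switching function is proprietary; the sketch below only illustrates the concept, using the classic y+ ≈ 5 edge of the viscous sublayer as an assumed switching criterion:

```python
def y_plus(rho, u_tau, y, mu):
    """Dimensionless wall distance: y+ = rho * u_tau * y / mu."""
    return rho * u_tau * y / mu

def wall_treatment(yp, sublayer_edge=5.0):
    """Illustrative two-layer choice: resolve the viscous sublayer when the
    first grid point sits inside it, otherwise use wall functions."""
    return "low-Reynolds (resolved sublayer)" if yp < sublayer_edge else "wall functions"

# Air: rho ~ 1.2 kg/m^3, mu ~ 1.8e-5 Pa*s, friction velocity ~ 0.5 m/s.
for y in (1.0e-5, 1.0e-3):  # first-cell wall distances of 10 um and 1 mm
    yp = y_plus(1.2, 0.5, y, 1.8e-5)
    print(f"y = {y:g} m -> y+ = {yp:.1f} -> {wall_treatment(yp)}")
```

The practical takeaway is that the user's mesh refinement near walls, not a solver setting, decides which formulation is actually in play.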
To conclude Part III: attention should be given to the paradigm chosen. If intensive research into a physical phenomenon related to electronics cooling is to be pursued, then an advanced multipurpose CFD code (such as Fluent or STAR-CCM+) shall be the choice, noting the disadvantage of a highly increased "time to market".
On the other hand, if global (or integral) results are to be achieved on a short time span, either to compare different thermal designs or to qualitatively check a concept, a concurrent CFD code like FloEFD, dedicated to the specific discipline of thermal analysis of electronics (along with its many lumped models for specific objects such as fans, porous media, heat sinks, etc.), is highly recommended.
It is the author's view that the paradigm chosen should be purpose oriented, taking into account the functionality and ease of use of a code along with the future possibility of interfacing it with other, closely related disciplines.
So before deciding upon a code, taking the time to evaluate its broad-sense cost effectiveness, across the many parameters it depends upon, should be regarded as a significant and integral part of the analysis workflow.