Double precision (FP64) is generally used in computer science, and the incomplete Cholesky preconditioned conjugate gradient (ICCG) method, which is widely used in computer simulations, is a typical example. Recently, the use of single precision (FP32) in the ICCG method for well-conditioned problems has been discussed as a way to reduce computational time and power consumption. The use of half precision (FP16) has also been examined in the field of machine learning, and hardware support for FP16 on GPUs and some CPUs is advancing. Because FP16 further halves the amount of memory transfer compared with FP32, a large benefit can be expected if it can be used in the ICCG method. However, when FP16 is applied to the ICCG method, ill-conditioned problems are difficult to solve because of the poor expressive ability of the FP16 exponent.

In recent computer science research, the use of low-precision floating-point formats such as single precision (FP32) and half precision (FP16) has been reexamined, and mixed precision has also been studied to reduce computational time and power consumption. The use of lower-precision formats such as FP32, FP16, and FP8 has been widely discussed and has yielded practical applications, and recent GPUs as well as the A64FX CPUs of the Fugaku and Wisteria/BDEC-01 Odyssey supercomputers support FP16. In response to this situation, the use of FP32 in computer simulations has been reexamined, and its accuracy has been reported to be practical. However, there are still few practical examples of FP16 because of its poor expressive ability: the exponent of FP16 covers only a narrow range, and its significand provides only about three decimal digits, which is often insufficient for practical use. For this reason, research on low-precision formats has proposed adaptive precisions that are not standardized by IEEE 754, such as FP21; in the research by Yamaguchi et al., the FP21 format was proposed and contributed to accelerating seismic simulations on GPUs.

Therefore, in this study, we evaluate the usefulness of FP16 in the ICCG method under several problem conditions, and we also propose CPU implementations of the FP21 and FP42 adaptive-precision formats, whose expressive ability is higher than that of FP16, and evaluate them. Additionally, to reduce calculation time with low precision, it is necessary to maintain high memory bandwidth utilization through effective SIMDization. Therefore, we also evaluate the performance of the ELLPACK (ELL) and Sell-C-σ storage formats, which can be SIMDized efficiently.
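To make the dynamic-range limitation of FP16 concrete, the following NumPy check (illustrative only, not taken from the study) shows how values that an ill-conditioned system can easily produce fall outside the representable range of FP16:

```python
import numpy as np

fp16 = np.finfo(np.float16)
# Normal FP16 numbers span roughly 6.1e-5 .. 6.5e4 -- far less dynamic
# range than FP32 (~1.2e-38 .. 3.4e38), so intermediate quantities in an
# ill-conditioned solve can overflow or vanish.
print(fp16.max, fp16.tiny)

assert np.isinf(np.float16(7.0e4))    # overflows to +inf
assert np.float16(1.0e-8) == 0.0      # below even the subnormal range: flushes to 0
assert np.float16(1.0e-4) > 0.0       # still representable as a normal number

# At the same time, FP16 halves memory traffic relative to FP32:
assert np.dtype(np.float16).itemsize * 2 == np.dtype(np.float32).itemsize
```

This trade-off (half the memory traffic, but a much narrower exponent range) is exactly why FP16 is attractive for the ICCG method yet fragile on ill-conditioned problems.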
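The idea behind an adaptive-precision format like FP21 can be sketched as follows. Assuming the layout used by Yamaguchi et al. (1 sign bit, 8 exponent bits as in FP32, and 12 fraction bits, with three 21-bit values packed into one 64-bit word), an FP21 value is essentially an FP32 value with the low fraction bits dropped, so it keeps the full FP32 exponent range. The function names below are my own, and truncation is used instead of rounding for simplicity:

```python
import numpy as np

def fp32_to_fp21_bits(x):
    """Keep the top 21 bits of an FP32 value (1 sign + 8 exponent +
    12 fraction); the low 11 fraction bits are truncated (not rounded)."""
    bits = np.float32(x).view(np.uint32)
    return np.uint32(bits >> np.uint32(11))

def fp21_bits_to_fp32(b):
    """Expand 21 stored bits back to FP32 by zero-filling the low bits."""
    return (np.uint32(b) << np.uint32(11)).view(np.float32)

def pack3(b0, b1, b2):
    """Pack three 21-bit values into one 64-bit word (63 bits used)."""
    return (np.uint64(b0) << np.uint64(42)) | \
           (np.uint64(b1) << np.uint64(21)) | np.uint64(b2)

def unpack3(w):
    """Recover the three 21-bit fields from a packed 64-bit word."""
    mask = np.uint64((1 << 21) - 1)
    return ((w >> np.uint64(42)) & mask,
            (w >> np.uint64(21)) & mask,
            w & mask)
```

Because the exponent field is identical to FP32's, FP21 trades only significand precision for storage, which is why its expressive ability exceeds FP16's; FP42 plays the same role relative to FP64.

```python
# Usage: round-trip a value; the relative truncation error is below 2^-12.
x = 1.2345678
rt = float(fp21_bits_to_fp32(unpack3(pack3(*[fp32_to_fp21_bits(x)] * 3))[0]))
assert abs(rt - x) / x < 2.5e-4
```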
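The ELL layout mentioned above can be illustrated with a minimal NumPy sketch (function names are my own): every row is padded to the length of the longest row, so column indices and values become dense 2-D arrays and the SpMV inner loop has a fixed trip count, which is what makes it SIMD-friendly. Sell-C-σ refines this by padding only within small row blocks after sorting rows by length, reducing the padding overhead.

```python
import numpy as np

def csr_to_ell(indptr, indices, data, n_rows):
    """Convert CSR arrays to ELLPACK (ELL): pad every row to the maximum
    row length so indices and values form dense (n_rows x width) arrays."""
    width = max(indptr[i + 1] - indptr[i] for i in range(n_rows))
    cols = np.zeros((n_rows, width), dtype=np.int64)
    vals = np.zeros((n_rows, width), dtype=np.float64)
    for i in range(n_rows):
        lo, hi = indptr[i], indptr[i + 1]
        cols[i, : hi - lo] = indices[lo:hi]
        vals[i, : hi - lo] = data[lo:hi]
    return cols, vals

def ell_spmv(cols, vals, x):
    """y = A @ x with A in ELL form; padded entries hold value 0, so they
    contribute nothing regardless of their (dummy) column index."""
    return (vals * x[cols]).sum(axis=1)
```

```python
# Usage with the 3x3 matrix [[1,0,2],[0,3,0],[4,5,6]] in CSR form:
cols, vals = csr_to_ell([0, 2, 3, 6],
                        np.array([0, 2, 1, 0, 1, 2]),
                        np.array([1., 2., 3., 4., 5., 6.]), 3)
y = ell_spmv(cols, vals, np.array([1., 1., 1.]))   # rows sum to [3, 3, 15]
```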