In imaging, and in mobile phone photography in particular, the market usually pays most attention to the camera module itself, or to its core component, the CMOS image sensor (CIS). Competition in the smartphone CIS market remains fierce, and more demand is shifting from 8-inch wafers to 12-inch wafers. At the same time, as demand grows for CIS with more than 40 million pixels, pixel process nodes are getting smaller.
This change is probably not good news for Sony, which holds the largest share of the mobile phone CIS market. In August of this year, a set of unverified figures from sources appeared on Twitter, indicating that in the first and second quarters of this year the gap between the image sensor market shares of Samsung and Sony had narrowed to its smallest ever. Sony’s share dropped to 42.5% in the second quarter, while Samsung’s rose to 21.7%. In the view of “International Electronic Business”, this is related to the advantages that Samsung and other market participants, such as SK Hynix, hold in process technology for high-pixel sensors.
The value in the imaging market is probably shifting. Because smartphones account for the largest share of the imaging field (Yole Developpement data from the middle of last year showed that mobile CIS accounted for 70% of all CIS sales), this article mainly uses smartphones as an example to discuss the transformation under way in the imaging market: a market originally dominated by CIS is gradually shifting toward image/vision processors, such as AI cores and ISPs (image signal processors). This change will create greater market value.
Smartphone imaging is also a special case. In other fields, such as medical imaging and industrial machine vision, the main goal at the image sensor level is simply “getting the shot”, and more attention goes to the post-processing and computation of image data. Mobile phone photography, by contrast, has always made “getting a good shot” the main goal, and has long attached great importance to the image sensor.
When promoting their cameras, smartphone manufacturers still prefer to highlight the high pixel count and large size of the CIS itself. However, the factors that determine imaging quality have been shifting from the CIS to the processing and computation of image data. This also reflects the challenge that the development of digital chips and the rapid advance of AI pose to traditional optical technology.
Signs that began to appear in the last two years
MediaTek put forward the concept of a “true AI camera” in 2018. The concept has three main elements: 1. a high-resolution, large-size CIS; 2. a multi-core ISP; 3. a high-performance AI core. The first point is the consensus in the imaging field, while the last two concern the post-processing of image data.
If an ISP processes data (processing), then AI and other vision processors perform deeper computation on that data (computing). The importance of the ISP has been mentioned repeatedly in the past, but its standing in imaging, especially in mobile phone photography, has been far below that of the CIS. The AI core, meanwhile, has become a hot commodity in imaging over the past two years. Against this background, the “true AI camera” marketing concept was essentially put forward to attract device manufacturers to adopt MediaTek SoC products, but it did raise the ISP and the AI core to the same level as the CIS.
Whether it is a camera-specific ISP or an AI processing unit, its use in photography falls under computational photography, which has been popular in the past two years. The general public’s understanding of “AI photography” may still be limited to beautification, face recognition, background removal, or making the sky bluer and the grass greener. In fact, AI’s assistance has reached deep into every stage of taking a picture, as discussed below.
Beyond chip manufacturers such as MediaTek, Google’s moves also deserve attention. According to “International Electronic Business Information”, in 2017 Google equipped its Pixel 2 phone with a dedicated Pixel Visual Core (Figure 1), an Arm-based image/vision processor in a SiP package that the company designed in-house. This processor can be regarded as a fully programmable, multi-core, domain-specific architecture chip for image, vision and AI workloads, and on the Pixel 4 it was succeeded by the Pixel Neural Core.
Of course, the Google Pixel series generally plays a forward-looking, exploratory role in the mobile field. Google has accumulated many years of experience in computational photography, and it believes that hardware dedicated to image processing for the camera is more efficient than the ISP and AI Engine built into Qualcomm’s SoC.
Figure 1. Inside the Pixel Visual Core of a Pixel phone
In the pre-smartphone era, the external ISP/DSP was a common concept, but with the general trend toward chip integration, today’s image processing hardware rarely exists as a separate part outside the SoC. Google’s approach has further raised the status of the image/vision processor: although an external, standalone image/vision chip may not become a trend, post-processing has become a more important part of every stage of taking a picture.
Google Pixel phones have an interesting tradition: the same CIS model can be used in two consecutive generations of Pixel phones. For example, the main cameras of the Pixel 3 and Pixel 4 are both believed to use the Sony IMX363 CIS. Even so, camera performance still takes a leap between generations, a point that is often discussed. This also shows that Google places great weight on image processing in imaging, rather than focusing only on image sensing.
Consider the imaging specifications of this year’s Qualcomm Snapdragon 865: its ISP supports a throughput of 2 gigapixels per second, as well as 4K HDR, 8K video capture, and photos of up to 200 million pixels. Working with the fifth-generation AI Engine, this ISP can quickly identify different backgrounds, people, and objects in a scene. Qualcomm now makes imaging a promotional focus of each generation of Snapdragon flagships.
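As a rough sanity check on these figures, the sketch below compares the quoted 2 gigapixels per second against the pixel rates of the listed modes. The frame rates (30 fps for 8K, 60 fps for 4K) and the frame dimensions are assumptions chosen for illustration, not figures from the source.

```python
# Rough pixel-throughput check against the quoted ISP speed.
ISP_PIXELS_PER_SECOND = 2_000_000_000  # 2 gigapixels per second, as quoted


def pixel_rate(width, height, fps):
    """Pixels per second needed to process a video stream."""
    return width * height * fps


def capture_time_seconds(megapixels):
    """Time to push one still frame through the ISP at the quoted rate."""
    return megapixels * 1_000_000 / ISP_PIXELS_PER_SECOND


# Assumed modes: standard UHD frame sizes, hypothetical frame rates.
rate_8k = pixel_rate(7680, 4320, 30)
rate_4k = pixel_rate(3840, 2160, 60)

print(f"8K at 30 fps: {rate_8k / 1e9:.2f} gigapixels/s")
print(f"4K at 60 fps: {rate_4k / 1e9:.2f} gigapixels/s")
print(f"200-megapixel still: {capture_time_seconds(200):.2f} s")
```

Under these assumptions both video modes fit within the quoted throughput, and a single 200-million-pixel frame takes a fraction of a second to pass through the ISP.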
Then look at Apple’s A14 this year. The increases in CPU and GPU performance are modest, but the Neural Engine, its AI core, has grown to 16 cores, raising its computing power to 11 TOPS; the A14 CPU also includes an upgraded machine learning AMX module (a matrix multiplication accelerator). AI processors on mobile phones are often criticized for lacking application scenarios, but they quietly play a role in computational photography.
Increasingly clear market status
Sony launched two “smart vision sensors” in May this year: the IMX500 and IMX501. The company claims they are the world’s first image sensors to incorporate AI processing capabilities. The sensor part of these two chips is a typical back-illuminated CIS, while the integrated edge AI processing part includes a logic chip with a DSP as well as the temporary storage needed by the AI model, making it a typical edge AI system. Strictly speaking, the IMX500/501 should probably not be defined as just a “sensor”.
When these two chips work with cloud services, they output only metadata at the data processing stage, which reduces data transmission latency, power consumption and communication costs. The essence of this design is to integrate part of the “post-processing” capability into the image sensor. This also enables higher-precision, real-time object tracking during video recording. Currently, these two sensors are mainly used in retail and industrial equipment.
As a supporting offering, Sony has also launched a software subscription service for its AI-enabled CIS. The potential market value of adding AI data analysis is greater than that of the sensor market itself. Although Sony does not expect this service to be profitable in the short term, it is very optimistic about its long-term development. Even though the IMX500/501 are not aimed at smartphones, this step reflects a change in Sony’s thinking about its CIS business: it is beginning to expand from pure image sensing into image/vision processing. After all, growth in the traditional CIS market is slowing.
In the middle of this year, Yole Developpement released a report entitled “2019 Image Signal Processor and Vision Processor Market and Technology Trends”. The report clearly mentioned: “AI has completely changed the hardware in the vision system and has an impact on the entire industry.”
“Image analysis has added a lot of value. Image sensor vendors are beginning to take an interest in integrating the software layer into the system. Image sensors must now go beyond simply capturing images and be able to analyze them as well.”
“But running such software means high computing power and storage requirements, hence the emergence of vision processors. The compound annual growth rate of the ISP market from 2018 to 2024 is stable at 3%, which means the ISP market will reach 4.2 billion U.S. dollars in 2024. At the same time, the vision processor market will see explosive growth, with a compound annual growth rate of 18% from 2018 to 2024. By 2024, its market value will reach 14.5 billion U.S. dollars.”
Figure 2. Image/vision processor shipments and market size forecast, 2018-2024
Of course, the current value of these markets has not yet reached the annual value of the CIS market. The combined 2024 forecast for the two markets, about $18.7 billion, only modestly exceeds the size of this year’s CIS market (this year’s CIS industry output is estimated at $17.2 billion). Note, however, that the growth rate of the CIS market is slowing, and that the software market for vision processing chips is not counted here. Sony, at least, believes the long-term potential of that software market is greater than that of the CIS market itself. Yole Developpement’s forecast shows that the ISP’s share of this market will gradually decrease, while vision processors, which focus more on computation, are clearly in greater demand (Figure 2).
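The arithmetic behind this comparison can be checked with a short sketch. It takes the 2024 values and growth rates quoted from the report and works backward to the implied 2018 base of each market, assuming six annual compounding periods between 2018 and 2024 (an assumption about how the report counts years).

```python
# Values quoted in the text, in billions of U.S. dollars.
ISP_2024 = 4.2
VISION_2024 = 14.5
CIS_THIS_YEAR = 17.2

YEARS = 2024 - 2018  # assumption: six compounding periods


def implied_base(value_end, cagr, years):
    """Starting value implied by an end value and a compound annual growth rate."""
    return value_end / (1 + cagr) ** years


isp_2018 = implied_base(ISP_2024, 0.03, YEARS)
vision_2018 = implied_base(VISION_2024, 0.18, YEARS)
combined_2024 = ISP_2024 + VISION_2024

print(f"Implied 2018 ISP market: ${isp_2018:.2f}B")
print(f"Implied 2018 vision processor market: ${vision_2018:.2f}B")
print(f"Combined 2024 forecast: ${combined_2024:.1f}B vs CIS ${CIS_THIS_YEAR}B")
```

On these assumptions, the vision processor market nearly triples over the period while the ISP market grows by about a fifth, which is the shift in share that Figure 2 describes.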
“It’s worth noting that many traditional industry participants have struggled to respond to the AI trend. This has allowed other players to join the competition, such as Apple, Huawei, Mobileye and other start-ups, and even companies from other fields, such as Nvidia.” This is a sign that the imaging market is expanding.
What exactly does AI bring to mobile phone photography?
In March of this year, DxOMark, a well-known French imaging laboratory, mentioned in an article that over the past 10 years the picture quality of smartphones has improved by more than 4 EV, of which 1.3 EV comes from improvements in image sensor and optical technology and 3 EV comes from the image/vision processor (post-processing of image data). This largely overturns the public’s basic assumption that better photos come from better CIS technology.
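To put these numbers in more familiar terms, the sketch below converts EV gains into linear factors, reading each EV as one photographic stop, that is, a factor of two. Whether DxOMark’s figures map exactly onto light-gathering factors is an assumption here; the snippet only illustrates the scale of the split.

```python
def ev_to_factor(ev):
    """Linear factor corresponding to a gain of `ev` stops (one stop = 2x)."""
    return 2.0 ** ev


SENSOR_EV = 1.3      # attributed to sensor and optics in the text
PROCESSING_EV = 3.0  # attributed to image/vision processing in the text
total_ev = SENSOR_EV + PROCESSING_EV

print(f"Sensor/optics: {ev_to_factor(SENSOR_EV):.1f}x")
print(f"Processing:    {ev_to_factor(PROCESSING_EV):.1f}x")
print(f"Total ({total_ev:.1f} EV): {ev_to_factor(total_ev):.1f}x")
```

Because stops add while factors multiply, the 3 EV attributed to processing corresponds to an eightfold gain, several times the contribution attributed to the sensor and optics.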
Image/vision processing is a fairly old topic that has been developed for many years. AWB (automatic white balance), ANR (active noise reduction), 3DNR (3D noise reduction), BLC (black level correction), HDR, and so on have long been routine ISP functions. In the past two years, the post-processing functions most often mentioned in connection with AI photography include face recognition, subject recognition, semantic segmentation, and intelligent beautification.
These are indeed part of the value AI brings to imaging, but AI’s contribution to the image quality of mobile phone photography has also reached the routine functions listed above. Much of Google’s research in computational photography involves these components. Take automatic white balance in low-light scenes, where traditional algorithms struggle to correct white balance. Google applied machine learning to this a few years ago, training an automatic white balance model by feeding it a large number of photos whose white balance had already been properly corrected.
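For context on what a “traditional algorithm” can look like, here is a minimal sketch of the classic gray-world heuristic, which assumes the average color of a scene is neutral gray. It is a stand-in chosen for illustration, not the algorithm any particular phone uses and not Google’s learned model; the pixel values are hypothetical.

```python
def gray_world_gains(pixels):
    """Per-channel gains that make the average color of the image neutral."""
    n = len(pixels)
    mean_r = sum(p[0] for p in pixels) / n
    mean_g = sum(p[1] for p in pixels) / n
    mean_b = sum(p[2] for p in pixels) / n
    # Green is the reference channel, so its gain stays at 1.0.
    return (mean_g / mean_r, 1.0, mean_g / mean_b)


def apply_gains(pixels, gains):
    """Scale each channel by its gain, clipping to the [0, 1] range."""
    return [tuple(min(1.0, c * g) for c, g in zip(p, gains)) for p in pixels]


# Hypothetical RGB pixels in [0, 1] with a warm (reddish) cast.
scene = [(0.8, 0.5, 0.3), (0.6, 0.4, 0.2), (0.4, 0.3, 0.1)]
gains = gray_world_gains(scene)
balanced = apply_gains(scene, gains)
print(gains)
print(balanced)
```

The heuristic fails whenever the scene is not neutral on average, which is the kind of gap a model trained on correctly balanced photos is meant to close.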
Google has applied machine learning to many aspects and features of Pixel phone imaging, for example real-time HDR in the viewfinder and during capture, and stabilization for video shooting. In the stabilization pipeline, the first stage performs motion analysis, acquiring gyroscope signals and combining them with optical image stabilization movements; the second stage, motion filtering, combines machine learning and signal processing to predict the motion trajectory of the camera itself; and the final stage, frame synthesis, compensates for picture distortion caused by the shutter and small movements.
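The motion filtering stage can be illustrated with a deliberately simple stand-in: smoothing a shaky camera path and deriving the per-frame correction needed to follow the smoothed path. The sketch uses plain exponential smoothing rather than the machine learning approach described above, and the angle values and the `alpha` parameter are hypothetical.

```python
def smooth_path(angles, alpha=0.2):
    """Exponentially smooth per-frame camera angles into a steadier virtual path."""
    smoothed = [angles[0]]
    for angle in angles[1:]:
        smoothed.append(alpha * angle + (1 - alpha) * smoothed[-1])
    return smoothed


def corrections(angles, smoothed):
    """Rotation to apply to each frame so that it follows the smoothed path."""
    return [s - a for a, s in zip(angles, smoothed)]


def jitter(path):
    """Sum of absolute second differences, a crude measure of shakiness."""
    return sum(abs(path[i + 1] - 2 * path[i] + path[i - 1])
               for i in range(1, len(path) - 1))


# Hypothetical yaw angles in degrees per frame: a slow pan with hand shake.
raw = [0.0, 1.4, 0.7, 2.3, 1.6, 3.2, 2.5, 4.1]
steady = smooth_path(raw)
fix = corrections(raw, steady)

print(f"jitter before: {jitter(raw):.2f}, after: {jitter(steady):.2f}")
```

A causal filter like this one lags behind intentional pans, which is one reason a real pipeline would try to predict the camera trajectory instead of only averaging past motion.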
Figure 3. Source: Google AI Blog
A more typical example is simulated background blur. Traditional solutions for simulating background blur are mainly based on stereo vision. Google’s solution not only relies on two stereo vision sources (the dual camera and the dual-pixel technology of the Pixel 4), but also applies semantic segmentation to the subject to make the blur more reliable. Google built a five-camera rig to shoot a large number of scenes and collect enough training data, then used TensorFlow to train a convolutional neural network: the dual-pixel and dual-camera inputs are first processed separately, each by an encoder that encodes the input into an intermediate representation (IR), and a single decoder then uses both intermediate representations to complete the final object depth calculation (Figure 3). The encoder here is a kind of neural network.
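The stereo vision principle that the traditional approach relies on can be sketched with the standard pinhole relation between disparity and depth, depth = focal length × baseline / disparity. This is the textbook formula, not Google’s network, and the focal length and baselines below are hypothetical values chosen only to contrast a wide dual-camera baseline with a very narrow dual-pixel one.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in meters from stereo disparity under the pinhole camera model."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


def disparity_for_depth(depth_m, focal_length_px, baseline_m):
    """Disparity in pixels that an object at `depth_m` produces."""
    return focal_length_px * baseline_m / depth_m


FOCAL_PX = 3000.0            # hypothetical focal length in pixels
DUAL_CAMERA_BASELINE = 0.012  # hypothetical: about 12 mm between two lenses
DUAL_PIXEL_BASELINE = 0.001   # hypothetical: about 1 mm within one lens

for depth in (0.5, 1.0, 3.0):
    wide = disparity_for_depth(depth, FOCAL_PX, DUAL_CAMERA_BASELINE)
    narrow = disparity_for_depth(depth, FOCAL_PX, DUAL_PIXEL_BASELINE)
    print(f"{depth} m: dual camera {wide:.1f} px, dual pixel {narrow:.1f} px")
```

With a narrow baseline the disparity shrinks to a few pixels or less at moderate distances, which helps explain why combining both sources, and learning how to weigh them, can give a more reliable depth map than either alone.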
In April of this year, MediaTek researchers published a paper entitled Learning Camera-Aware Noise Models, which proposes a method for modeling image sensor noise through “a data-driven method that learns a noise model from real-world noise. This noise model is camera-aware: different sensors have different noise characteristics, and they can all be learned.”
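To make the idea of a per-camera noise model concrete, the sketch below fits the classical signal-dependent model, in which noise variance grows linearly with signal level (variance = a × signal + b), separately for two sensors. This is a simple statistical stand-in, not the learned model from the paper, and the measurements for both sensors are made-up illustrative numbers.

```python
def fit_noise_model(means, variances):
    """Least-squares fit of variance = a * mean + b for one sensor."""
    n = len(means)
    mean_x = sum(means) / n
    mean_y = sum(variances) / n
    sxx = sum((x - mean_x) ** 2 for x in means)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(means, variances))
    a = sxy / sxx          # signal-dependent (shot noise) term
    b = mean_y - a * mean_x  # signal-independent (read noise) term
    return a, b


# Hypothetical (mean signal, noise variance) measurements from flat patches.
signal_levels = [10.0, 50.0, 100.0, 200.0]
sensor_a_var = [1.7, 2.5, 3.5, 5.5]
sensor_b_var = [0.9, 2.9, 5.4, 10.4]

model_a = fit_noise_model(signal_levels, sensor_a_var)
model_b = fit_noise_model(signal_levels, sensor_b_var)
print("sensor A: a=%.3f b=%.3f" % model_a)
print("sensor B: a=%.3f b=%.3f" % model_b)
```

The two fits yield different parameters for the two sensors, which is the sense in which a noise model is specific to a camera; a data-driven approach replaces this fixed linear form with one learned from real noise.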
These examples all show that more and more market participants at different levels are investing in image post-processing. This is why Google Pixel phones, using an older CIS model, still hold an advantage in many imaging comparisons against other phones using CIS with 100 million pixels or more. An external AI vision chip clearly gives Google more room to work with.
Today’s mobile phones widely use AI to enhance image quality, including in traditional steps such as framing, noise suppression, and automatic white balance. From the user’s perspective, however, the AI chip’s involvement in this computation is barely noticeable.
As these technologies become more and more common in imaging, the old CIS-centric view of mobile imaging holds less and less. Device manufacturers have shifted their focus in mobile phone photography to image/vision processing and computation. After all, traditional optical technology cannot develop as fast as digital chips.
Nowadays, many people compare photos taken with mobile phones against those from full-frame cameras. Even if this comparison has little practical significance, it shows how far the image/vision processing and computing power of mobile phones compensates for the shortcomings of mobile CIS. In fact, it is also a contest between two approaches and two eras.