Super Hi-Vision at the London 2012 Olympics
A Higher Technical Standard
Continuing the work begun in our last issue, in which we published the first NAB Proceedings, this issue brings the fruits of another recently signed partnership. Alternating with the papers from the American association, we will now also publish SMPTE Technical Papers.
The acronym, which stands for Society of Motion Picture and Television Engineers, has always carried the weight of credibility and seriousness in the broadcasting industry, above all in technical and scientific work and, especially, in standardization.
It is to SMPTE that we owe many of the initiatives that yielded innovations in every sector of the broadcast industry, from filming and capture technologies to the editing and transmission of content. SET is very proud to sign this further partnership and to bring such highly regarded content to its members and to the readers of Revista da SET.
This action is part of the continuous effort that the Sociedade Brasileira de Engenharia de Televisão has made to position our market at a global level. It is also part of an effort to improve all of our editorial and scientific products, taking into account the needs of the professionals who make up our market.
To open this issue, we bring an article on one of the main advances of the moment: ultra-high-resolution transmission. With the advent of 4K technologies and tests in 8K, broadcasters face many challenges, both in infrastructure and in modus operandi.
It is not only a matter of new cameras, but also of new workflows and, above all, of a new way of approaching content production. The way people watch television is going to change, which shakes the foundations of the entire industry built around this habit.
Olímpio José Franco
President of SET
The Oscar® and Emmy® Award-winning Society of Motion Picture and Television Engineers (SMPTE), a professional membership association, is the worldwide leader in developing and providing motion-imaging standards and education for the communications, technology, media, and entertainment industries. An internationally recognized and accredited organization, SMPTE advances moving-imagery education and engineering across the broadband, broadcast, cinema, and IT disciplines. Since its founding in 1916, SMPTE has published the SMPTE Motion Imaging Journal and developed more than 650 standards, recommended practices, and engineering guidelines. SMPTE members include motion-imaging executives, engineers, creative and technology professionals, researchers, scientists, educators, and students from around the world. Information on joining SMPTE is available at www.smpte.org/join.
Super Hi-Vision at the London 2012 Olympics
By Masayuki Sugawara, Satoru Sawada, Hayato Fujinuma, Yoshiaki Shishikui, John Zubrzycki, Rajitha Weerakkody, and Andy Quested
For the London 2012 Olympics, the BBC and Japan Broadcasting Corp. (NHK), in cooperation with the Olympic Broadcasting Service (OBS), held public viewings of Super Hi-Vision at venues in Japan, the U.K., and the U.S. The operation consisted of program production at the Olympic venues, post-production at a BBC studio, IP distribution of live and nonlive programs through global and domestic IP networks, and regular presentations during the Olympic Games at nine venues in three countries.
During the London Olympics, the British Broadcasting Corporation (BBC) and Nippon Hoso Kyokai (NHK) held Super Hi-Vision (SHV) public viewings (PVs) at venues in Japan, the U.K. (Fig. 1), and the U.S., in cooperation with the Olympic Broadcasting Service (OBS).
SHV is a next-generation broadcasting system being developed by NHK, which reproduces strong sensations of presence [1], as though the viewer were actually there, through the use of ultra-high-resolution 33-megapixel video (7680 pixels horizontally by 4320 pixels vertically) and 22.2 multichannel audio (22.2ch audio) [2].
A total of seven events were presented in SHV during the London 2012 Olympic Games: the opening and closing ceremonies, swimming, basketball, athletics, cycling, and synchronized swimming. Content created at the venues was carried by optical fiber to a temporary production and transmission base that had been established in TC0, a studio at the BBC Television Centre in London. Live content was sent “as-is” via the studio in Television Centre, while other content was edited and packaged there before delivery. Both live and nonlive programs were compressed into transport streams (TSs) at a rate of approximately 280 Mbit/sec before distribution via IP lines to the PV venues in the U.K., the U.S., and Japan for presentation (Fig. 2).
In addition, an uncompressed signal was sent from the BBC Television Centre studio to a PV theater at the International Broadcast Centre (IBC) within the Olympic Park. This theater was for members of the international media community so they could experience SHV viewing of the events around the Olympic venues.
The opening and closing ceremonies and the swimming competitions were presented live in the U.K. and the U.S., but because of time differences, only the morning (local time) section of the July 30 swimming competition was shown live in Japan.
Outside Broadcasting-Van System (OB-Van)
Two trucks were used at each location: one for video (Fig. 3) and the other for audio (Fig. 4). They were staffed with a single crew, and after each event they had to de-rig the equipment before moving to the next venue.
The video OB-van was a rental truck that was furnished with empty equipment racks before being shipped to London from Japan. Once there, it was rigged with the SHV production equipment. This included an eight-input switcher system along with two SHV cameras (three at the opening and closing ceremonies), two SSD live slow-motion devices, an up-converter (to up-convert the host HD signal), and graphics equipment.
Video Engineer Operation
For video production, three 28-in. UHD-1 (3840 x 2160) monitors were used instead of a large SHV monitor, simply because a large monitor would not fit into the van. Three video engineers (VEs) were assigned to operate two cameras; they were positioned in a row: VE1 (in charge of Camera 1), VE2 (chief VE), and VE3 (in charge of Camera 2). The role of VE2 was to check the color and iris adjustments for both cameras (Fig. 5).
As the SHV camera viewfinders (VFs) are only HD resolution, it is very difficult for the camera operators to adjust focus accurately, so currently the VEs must handle camera focus. Although the VEs’ monitors are only UHD-1 resolution, the focus adjustment can be carried out with SHV precision by using a camera control unit function that crops the image and displays a portion pixel-mapped to the UHD-1 display. In the future, camera operators will need to handle focus control themselves, especially for multicamera events; it is therefore vital that this part of the operation is improved.
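The pixel-mapped focus check amounts to choosing a 3840 x 2160 window inside the 7680 x 4320 frame and showing it without scaling. The following is a minimal sketch of that window calculation; the function name `crop_window` and the clamping behavior at the frame edges are illustrative assumptions, not details of the actual camera control unit.

```python
SHV_W, SHV_H = 7680, 4320    # full SHV frame (from the text)
UHD1_W, UHD1_H = 3840, 2160  # VE monitor resolution (from the text)

def crop_window(center_x, center_y):
    """Return (left, top, right, bottom) of a 1:1 pixel-mapped crop.

    The window is UHD-1 sized and centered on the requested point,
    then clamped so it stays inside the SHV frame (assumed behavior).
    """
    left = min(max(center_x - UHD1_W // 2, 0), SHV_W - UHD1_W)
    top = min(max(center_y - UHD1_H // 2, 0), SHV_H - UHD1_H)
    return left, top, left + UHD1_W, top + UHD1_H
```

Each window covers one quarter of the frame area, so a subject anywhere in the picture can be inspected at native resolution by moving the window center.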
For three-camera operation (at the opening and closing ceremonies) a temporary monitor was installed and a technical director took care of the focus.
Three cameras [3], each with four 1.25-in. CMOS image sensors, were used at the opening and closing ceremonies, while for all other events this number was reduced to two. Each camera had two lens options, a 5 x 12mm to 60mm lens (Fig. 6) and a 10 x 18mm to 180mm lens. The lenses were selected according to the requirements of each venue.
One feature of SHV is the different optimal viewing distance. While Hi-Vision is best viewed at three times the screen height, SHV is best viewed at 0.75 times the screen height. This gives a horizontal field of view of about 100°. SHV therefore covers a much wider field of view with high-resolution images, which produces a very immersive sensation. When watched on screens larger than 400 in., viewers get a sensation of being wrapped in the field of view, producing a real sense of presence without actually recognizing any information in the peripheral areas unless they consciously look at that portion of the screen. This also means shots must be captured with the most important action in the center of the screen. The camera viewfinder has “action area” markings to help the camera operators maintain good shot framing (Fig. 7).
To cover the action, it was decided each venue would have a master or base camera in a high position while the second camera would give a different angle from a lower position. The second camera position was primarily used for closeups. Depending on the venue, the low-position camera gave an image with more depth and a greater sense of presence, so it was switched in and out with the base camera shot during each event. The camera positions for athletics and the opening and closing ceremonies are described in Fig. 8.
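The relation between viewing distance and field of view can be checked with simple trigonometry. The sketch below assumes a flat 16:9 screen and a viewer centered on it; the function name is illustrative.

```python
import math

def horizontal_fov_deg(distance_in_heights, aspect=16 / 9):
    """Horizontal field of view for a viewer centered on a flat screen.

    distance_in_heights: viewing distance in units of screen height.
    aspect: width/height of the screen (16:9 assumed, as 7680 x 4320).
    """
    half_width = aspect / 2  # half the screen width, in screen heights
    return math.degrees(2 * math.atan(half_width / distance_in_heights))
```

With these assumptions, a distance of 0.75 screen heights gives roughly 100°, and a distance of 3 screen heights gives roughly 33°.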
The following guidance was given to the OB crews to help maintain the sense of presence during the presentations:
(1) Position the primary subject(s) in the frame so that it draws the viewer’s attention.
(2) Follow the primary subject(s) to maintain this position in the frame.
(3) Do not use unnecessary pans or zooms.
It is very important that camera work, especially the focus and the position of the primary subject(s) in the center of the frame, is carefully controlled in order to maintain the viewer’s sense of realism. Unnecessary or unintentional camera movement will damage the natural sensation of presence.
It is also important that the images and the sound complement each other. To achieve this, a 5.1-channel downmix of the 22.2ch audio was provided in the video OB-van production room (Fig. 9). This enabled the video switching to be coordinated with the sounds coming from the venue, including the cheering crowds.
As video monitoring was limited to 28-in. UHD-1 (3840 x 2160) displays, technicians had to struggle with a perceptual gap between the images seen in the OB-van and those seen on the large PV screens. To check the content in a real viewing environment, the OB crew occasionally checked the 145-in. PDP display in the IBC theater or the 300-in. screen at the PV venue in London.
An audio OB-van (Fig. 4) rented in the U.K. for the Olympic period was fitted with a 22.2ch live mixing board and 22.2ch speakers (Fig. 10). The mixing board was developed at the NHK Science & Technology Research Laboratories and its operation optimized for live production, including 3D sound-image positioning (3D panning) functions [4].
The 22.2ch-audio one-point microphone (Fig. 11) was a fixed, monolithic microphone holder consisting of a 45-cm sphere divided into upper, middle, and lower layers. Each layer was partitioned into eight directions by sound baffles with a compact microphone installed in each partition.
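The layout of the one-point microphone can be enumerated to confirm the capsule count. This is only a sketch: the text gives three layers and eight directions, while the even 45° spacing and the names used here are assumptions.

```python
LAYERS = ("upper", "middle", "lower")  # from the text
DIRECTIONS = 8                         # partitions per layer (from the text)

def capsules():
    """List (layer, azimuth in degrees) for every microphone capsule.

    Evenly spaced azimuths are an assumption; the text only says each
    layer is partitioned into eight directions.
    """
    step = 360 / DIRECTIONS
    return [(layer, i * step) for layer in LAYERS for i in range(DIRECTIONS)]
```

The result is 24 capsules, which matches the 24 channels of 22.2ch audio (22 full-range channels plus 2 low-frequency channels).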
The microphone was positioned near the base camera. Unfortunately, placing the microphones freely in the venues was not permitted; therefore, audio was produced using the one-point microphone as a base, with microphone feeds distributed from the international feed mixed in. To help reproduce the expansive sound of the venues, 3D reverb equipment (22.2ch audio reverb) was used.
The 22.2ch output feeds were recorded to a multitrack recorder in case the signals from live PVs needed to be used again for other presentations.
Presentations at the IBC on-site theater gathered a large audience of broadcasters from around the world, which suggests there is an increasing interest in SHV, while at the Olympic venues, people from the media visited the video and audio OB-vans in groups. Visitors were welcomed and given explanations of SHV video and 22.2ch audio. This was important because in addition to the PVs, one of the objectives of the trial was to promote SHV. Broadcasting staff (Fig. 12) who came to see the high-resolution video and SHV system were given an explanation using a UHD-1 monitor in the OB-van, while in the audio OB-van, visitors were able to sit in the main-mixer seat and experience the 22.2ch sound.
For SHV PV production at the London Olympics, transmission, play-out, video editing, audio post-production, and monitoring operations were carried out in the BBC TC0 studio (Fig. 13). Cabling and layout were carefully planned to maintain the quality of the SHV signals and monitoring.
Line and Transmission Systems
An SHV 8 x 8 routing switcher system (an HD 128 x 128 router configured to switch 16 channels simultaneously) was used to access the various resources in TC0, including the live signals (main and backup) brought in from each venue by optical lines and the P2 recorders (main and backup) used for playing back the content (Fig. 14). An additional and separate monitoring system was built to allow the signals from the relay site to be checked even during a PV screening (Fig. 15).
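The relation between the HD router and the SHV router is a matter of ganging ports: 128 HD ports divided into groups of 16 yields 8 SHV ports. The sketch below illustrates this grouping; the contiguous port numbering and the function names are assumptions, since the text does not describe how the channels were actually assigned.

```python
HD_PORTS = 128       # HD router size (from the text)
LINKS_PER_SHV = 16   # HD channels switched together (from the text)

def hd_ports_for(shv_port):
    """HD router ports carrying one SHV port (contiguous numbering assumed)."""
    if not 0 <= shv_port < HD_PORTS // LINKS_PER_SHV:
        raise ValueError("SHV port out of range")
    start = shv_port * LINKS_PER_SHV
    return list(range(start, start + LINKS_PER_SHV))

def switch(shv_in, shv_out):
    """HD crosspoints (input, output) to set so all 16 links switch together."""
    return list(zip(hd_ports_for(shv_in), hd_ports_for(shv_out)))
```

For example, `switch(0, 3)` returns 16 crosspoints that route HD inputs 0 to 15 to HD outputs 48 to 63.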
For the nonlive competition content, the main and backup feeds were recorded on two SHV P2 recorders in the TC0 studio. Each SHV recorder used 16 P2 cards with an additional card to record HD proxy data. After each competition, two edited packages were produced, a World Feed for the U.K. and U.S. and a Japan Feed for Japan, using two SHV editors. These were edited simultaneously as 10-min segments with content dependent on medal standing or popularity of the event. Several segments were then combined into a package of approximately 45 min duration.
Editing was carried out overnight from midnight to 10 a.m. to be ready for distribution to Japan and to be prepared for the presentation in the U.K. and the U.S. starting at noon. HD material was used as proxy data for an off-line edit that produced EDL data (Fig. 16). The SHV image was then conformed using the off-line EDL data (Fig. 17). Captions were created as SHV (7680 x 4320) TIFF files by the graphics overlay equipment and imported to the SHV editor. Finally, a 24-channel WAV file of the complete audio master was imported and merged into the SHV editor time line to complete the content (see discussion in the next section).
For the live transmissions, the on-site audio mix was transmitted from the audio OB-van to the PV venues, but for the edited packages a 22.2ch multichannel audio post-production process was required. Two simultaneous audio edits were produced (to match the two video edits) using the HD proxy editing data in two audio post-production systems. The first, set up in the audio mixing room (Fig. 18), consisted of a digital audio workstation (DAW) with a mixing desk that supports 22.2ch audio, plus a backup DAW. The second audio post-production system was set up in the SHV preview room (Fig. 19) and consisted of a DAW and the preview room’s 22.2ch sound speaker system (see discussion in the next section). Audio post-production made it possible to control the overall audio balance for multiple events, improving the quality of the content.
SHV PREVIEW ROOM
A preview room with an 85-in. SHV LCD monitor and a 22.2 multichannel audio system was used to carefully monitor the live events and edited packages (Fig. 20). The room was primarily set to manage the quality of the signals, but it also allowed previewing on a large monitor at the ideal distance and with the ideal speaker layout. This was used to give feedback and advice to the on-site teams on the camera work and vision switcher timing to help give a better and more realistic sensation of presence.
DISTRIBUTION AND SCREENING
Distribution and Screening Overview
Distribution and screening refers to the delivery of the packaged content and live signals from the BBC studio to each of the PV venues and the presentation at the venues.
Within the overall system shown in Fig. 2, “Distribution and Screening” corresponds to the section after the BBC studio. This includes encoding of the baseband signals from the BBC studio equipment, transmission over IP, the transport system that decodes and restores the baseband signals, and the screening system that reproduces the video and audio from the baseband signals.
IP networks within the U.K. were used to transmit signals to the three locations (London, Bradford, and Glasgow), and dark fiber was used to transmit uncompressed signals to the IBC within the Olympic Complex. Global IP networks were used to transmit to Washington, D.C., in the U.S. and to Tokyo in Japan. In the U.S., the domestic network was used to transmit the signals from the terminus of the global network to the venue within the U.S.
A system diagram of the distribution system, consisting of encoder, transport equipment, and IP network, is shown in Fig. 21 [5, 6].
The encoder employs AVC/H.264 for video encoding and MPEG-2 AAC-LC for audio encoding (Fig. 22) [7]. Prior to video encoding, a format converter was used to convert the so-called dual-green format (equivalent to Bayer pattern) signals to ordinary YUV signals. During this process, the green signal composed of two diagonally pixel-offset signals of 3840 x 2160 pixels is converted to the 7680 x 2160 format by shifting both of the two signals by half a pixel. The Y signal with 7680 x 2160 pixels is then calculated from this new green signal and the up-converted red and blue signals. The YUV signals were partitioned into eight 1920 x 1080 pixel signals (1080-60P signals) for the operation.
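The partition into eight 1080-60P signals follows from the pixel counts: 7680 x 2160 divided by 1920 x 1080 is exactly 8. The sketch below lists one possible set of partitions, assuming plain rectangular tiles in a 4 x 2 grid; the text does not say whether the real system divided the picture spatially in this way or by some form of sample interleaving.

```python
def tile_rects(width=7680, height=2160, tile_w=1920, tile_h=1080):
    """Return (left, top, width, height) for each tile, row by row.

    A rectangular 4 x 2 tiling is an assumption made for illustration.
    """
    rects = []
    for top in range(0, height, tile_h):
        for left in range(0, width, tile_w):
            rects.append((left, top, tile_w, tile_h))
    return rects
```

Each tile can then be handed to an ordinary 1080-60P encoder, which is what allows the system to be built from eight standard encoder units.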
An encoder for each of these elementary signals compressed the 1080-60P signal using AVC/H.264 and up to four audio channel signals using MPEG-2 AAC-LC. Each encoder unit used a prefilter to switch the intensity of a low-pass filter according to bit rate and picture pattern [8]. This prefilter enables encoding with little deterioration over a wide range of bit rates and controls picture quality, maintaining high quality in the region of interest [8]. Finally, the MPEG-2 TS signals output by the encoder units were multiplexed into two MPEG-2 TS signals by TS multiplexers. After the MPEG-2 TS signals output by the encoder equipment were converted to IP signals, they were transmitted over IP networks.
In order to transmit the 280 Mbit/sec compressed MPEG-2 TS SHV video and audio over global IP networks, the required bandwidth had to be secured. Fluctuations that occur on the network had to be compensated for, and a mechanism for maintaining security was needed. Specifically, functions to control jitter for synchronous transmission and for real-time encoding and decoding, along with functions to handle packet loss and functions for advanced error correction, were needed. All these functions were built into the IP transmission terminal equipment.
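A rough budget for the encoder bank can be derived from the figures given in the text. The sketch below assumes the approximately 280 Mbit/sec total is shared equally among the eight encoder units, which the text does not state; the names are illustrative.

```python
TOTAL_MBPS = 280          # approximate total TS rate (from the text)
ENCODERS = 8              # one per 1080-60P signal (from the text)
AUDIO_PER_ENCODER = 4     # maximum audio channels per encoder (from the text)
AUDIO_CHANNELS_22_2 = 24  # 22 full-range + 2 low-frequency channels

def budget():
    """Per-encoder rate (equal split assumed) and audio channel headroom."""
    capacity = ENCODERS * AUDIO_PER_ENCODER
    return {
        "mbps_per_encoder": TOTAL_MBPS / ENCODERS,
        "audio_capacity": capacity,
        "audio_spare": capacity - AUDIO_CHANNELS_22_2,
    }
```

Under these assumptions each encoder unit handles about 35 Mbit/sec, and the 32 available audio channel slots are enough for the 24 channels of 22.2ch audio.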
The decoding equipment performed the same processing as the encoding equipment in reverse in order to produce the baseband video and audio signals.
The IP signals were carried with the cooperation of the research IP networks in the countries involved. The use of UDP for live streaming meant that any packet drops would cause visible and audible impairments. Forward error correction was used to counter this, with a value of 20% being used for the network in the U.K. Stable network operation was achieved with no uncorrectable packet losses for several days at a time. However, achieving this situation took several months of careful planning and testing with the network providers involved.
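To illustrate how forward error correction lets a receiver repair a dropped UDP packet without retransmission, the sketch below uses a simple XOR parity scheme with one parity packet per five data packets, which corresponds to 20% overhead. This is a swapped-in textbook technique chosen for clarity: the text does not describe the FEC scheme actually used, and reading the 20% figure as packet overhead is an assumption.

```python
from functools import reduce

GROUP = 5  # 1 parity packet per 5 data packets = 20% overhead (assumed scheme)

def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))

def add_parity(packets):
    """Return the data packets plus one XOR parity packet (equal lengths assumed)."""
    return list(packets) + [reduce(xor_bytes, packets)]

def recover(group):
    """Rebuild at most one missing packet (marked None) and return the data."""
    missing = [i for i, p in enumerate(group) if p is None]
    if len(missing) > 1:
        raise ValueError("more than one loss: uncorrectable with single parity")
    if missing:
        present = [p for p in group if p is not None]
        group = list(group)
        group[missing[0]] = reduce(xor_bytes, present)
    return group[:-1]  # drop the parity packet
```

With 20% overhead, a 280 Mbit/sec stream would occupy about 336 Mbit/sec on the network. A single-parity scheme like this one corrects only one loss per group; burst losses need interleaving or a stronger code.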
A diagram of the screening system is shown in Fig. 23. Content was distributed from NHK STRL to the PV venues in Japan, and from the BBC Studio TC0 directly to the PV venues in the U.K. and the U.S. The video and audio signals were decoded and then presented with SHV projectors and 22.2ch audio equipment at each venue. Each venue was also equipped with TS recording and playback equipment, which was used as a backup system if the lines between NHK STRL or the BBC studio and each venue experienced difficulty or if, for any reason, any of the venues needed to make their own playback schedule. A baseband signal generator was also built into the system to enable adjustments to the audio and video systems.
At the PV venues, two types of projector were used as well as an 85-in. LCD, a 145-in. PDP, and a 360-in. multiscreen LCD. At five of the theater venues, the SHV projectors had 8-megapixel display devices and a pixel-shifting technology called e-shift to increase the resolution [9]. This had the advantage of reducing the size and power consumption of the projector. The projector in the NHK Minna no Hiroba Fureai Hall had 33-megapixel display devices for RGB [10], making it a so-called full-resolution projector. This projector had a high output power because it was for use on the venue’s very large 520-in. screen.
The 85-in. LCD [11] was used for direct viewing in the Washington, Akihabara, and Fukushima venues. The 85-in. display was combined with a sound system to demonstrate a possible home system. The 145-in. PDP [12] was used to set up a theater in a room at the IBC, and the 360-in. multiscreen LCD was installed at the entrance to the NHK Studio Park tour for NHK visitors.
A new 22.2 multichannel audio system was developed for direct viewing displays and was combined with the 85-in. LCDs and the 145-in. PDP. In theaters using a projector, the theater sound system was used.
| COUNTRY | CITY | VENUE | SCREEN SIZE (in.) | DISPLAY DEVICE |
|---|---|---|---|---|
| Japan | Tokyo | NHK Minna no Hiroba Fureai Hall | 520 | Projector |
| Japan | Tokyo | NHK Studio Park | 360 | Multiscreen LCD |
| Japan | Tokyo | Belle Salle Akihabara | 300 | Projector |
| Japan | Fukushima | NHK Fukushima broadcasting station | 350 | Projector |
| U.K. | London | BBC Broadcasting House | 300 | Projector |
| U.K. | Bradford | National Media Museum | 250 | Projector |
| U.K. | Glasgow | BBC Pacific Quay | 350 | Projector |
| U.S. | Washington, D.C. | Comcast (NBCU) | 85 | LCD |
TABLE 1. LIST OF SCREENING VENUES.
SCREENING VENUES AND REACTIONS
The display equipment and screen size used at each PV venue is shown in Table 1.
In Japan the PVs were held from July 28 to August 12 at the NHK Minna no Hiroba Fureai Hall in Shibuya (Fig. 24), NHK Studio Park, in the Belle Salle Akihabara event space near the JR Akihabara station, and the NHK Fukushima broadcasting station. At each of these venues several related events were held at the same time to attract more people and to ensure that those attending really enjoyed the event. The Shibuya venue attracted families during the summer holidays, the Akihabara venue was aimed at young people and those interested in technology, and the Fukushima venue appealed to supporters of local athletes. In total, more than 200,000 people attended the events over the Olympic period. There were exclamations of wonder as the screening began, and many comments were received on the strong sense of presence and intensity of the presentation (Table 2).
In the U.K., the BBC took the lead with cooperation from NHK. Screenings were held from July 23 to August 12 at the BBC Broadcasting House in London, the National Media Museum in Bradford, and the BBC Scotland building in Glasgow. At the IBC, the OBS took the lead, providing demonstrations targeted at the broadcasting community.
Before the opening ceremony, a short program introducing SHV was screened along with a sequence of “sights and sounds” from around London and the Olympic venues, produced by NHK and the BBC in the run-up to the games. Many dignitaries and people connected to the broadcasting industry attended the screenings, and there were many comments about how wonderful the images were as well as inquiries about when the SHV broadcasting would begin.
In the U.S., NBC took the lead with cooperation from NHK, and screenings were held from July 27 to August 12 in meeting rooms in the Comcast building in Washington, mainly for invited guests from government and the content, communications, and electronics industries.
|It was just like I was watching it at the Olympic.|
|I could understand what the realistic sensation time.|
|The impact was [so] great that I could almost smell fireworks at the opening ceremony|
|Felt like the person behind me was clapping.|
|Could see the natural 3D image.|
|Could clearly see the person who threw a bottle.|
|Want to believe this is the first step to the next.|
|Blurring with pan shot is bothersome.|
|Image is very clear but it needs a large screen questionable whether it fits home viewing.|
TABLE 2. LIST OF TYPICAL COMMENTS.
The development of SHV has been advanced, with “presence” as its strongest feature. These events have once again shown that an extremely strong sense of presence can be delivered by SHV video and audio and that unprecedented levels of emotion can be imparted to viewers, giving them a sense that they were actually at the Olympic venue. By producing and transmitting programs continuously, every day during the Olympic Games, using live coverage and recorded and edited content, it has also been demonstrated that SHV can be operated much like “ordinary” broadcasting and that SHV broadcasting is a realistic plan for the near future.
The production style, without voice-overs (announcing or comments) and mainly using wide camera angles and long (slow) cut ratios, received many comments of surprise and admiration about the new possibilities it presents to broadcasting businesses.
1. M. Sugawara, K. Masaoka, M. Emoto, Y. Matsuo, and Y. Nojiri, “Research on Human Factors in Ultra-high-definition Television to Determine its Specifications,” SMPTE Mot. Imag. J., 117(3):23-29, 2008.
2. K. Hamasaki, T. Nishiguchi, R. Okumura, Y. Nakayama, and A. Ando, “A 22.2 Multichannel Sound System for Ultrahigh-Definition TV (UHDTV),” SMPTE Mot. Imag. J., 117(3):40-49, 2008.
3. K. Arai, S. Mitsuhashi, D. Ito, H. Fujinuma, R. Funatsu, and T. Kikkawa, “Newly Developed UHDTV Camera System,” presented at the IBC2010 Conference, Amsterdam, the Netherlands, Sept. 9-14, 2010.
4. A. Ando and K. Hamasaki, “Sound Intensity Based Three-Dimensional Panning,” presented at the AES 126th Convention, paper #7675, Munich, May 2009.
5. Y. Nojiri, K. Iguchi, K. Noguchi, T. Fujii, and M. Ogawara, “National Super Hi-Vision Transmission Test Using IP Networks for Global Research and Education,” Broad. Technol. (Hoso Gijutsu), 64(6):135-141, 2011 (in Japanese).
6. S. Sakaida, K. Iguchi, N. Kimura, M. Ogawara, and T. Fujii, “International Super Hi-Vision Transmission Test and Exhibition of Related Equipment at IBC2011,” Broad. Technol. (Hoso Gijutsu), 65(1):151-156, 2012 (in Japanese).
7. Y. Shishikui, K. Iguchi, S. Sakaida, K. Kazui, and A. Nakagawa, “Development of High Performance Video Codec for Super Hi-Vision,” presented at the 65th NAB Broadcast Engineering Conference, pp. 234-239, Las Vegas, April 2011.
8. A. Nakagawa and John L. Pittas, “High Qos and High Picture Quality Enable the HD Revolution,” SMPTE Mot. Imag. J., 117(3):55-63, 2008.
9. F. Okano, M. Kanazawa, Y. Kusakabe, M. Furuya, and Y. Uchiyama, “Complementary Field Offset Sampled-Scanning for GRB Video Elements,” IEEE Trans. Broad., 58(2):291-295, 2012.
10. T. Nagoya, T. Kozakai, T. Suzuki, M. Furuya, and K. Iwase, “The D-ILA Device for the World’s Highest Definition (8K4K) Projection System,” Proc. Intl. Display Workshop (IDW), 15:203-206, 2008.
11. T. Kumakura, M. Shiomi, S. Horino, Y. Yoshida, and S. Mizushima, “Development of Super Hi-Vision 8Kx4K Direct-View LCD for Next Generation TV,” SID 2012 Digest, pp. 780-783, 2012.
12. K. Ishii, T. Usui, Y. Murakami, Y. Motoyama, M. Seki, Y. Noguchi, T. Furutani, T. Nakakita, and T. Yamashita, “Developments of a 145-in. Diagonal Super Hi-Vision Plasma Display Panel,” SID 2012 Digest, pp. 71-74, 2012.
A Contribution received November 2012. Copyright © 2013 by SMPTE.
“This article is republished with permission from SMPTE. The original article was published in the SMPTE Motion Imaging Journal, January/February 2013.”