Apple's announcement of FaceTime, its new video telephony solution, listed a number of standards: H.264, AAC, SIP, STUN, TURN, ICE, RTP, and SRTP. If FaceTime really does follow them, this would be a great start for interoperability, as the very same standards are widely used in modern video communications solutions from the majority of vendors.

Here’s a short explanation on the role of each standard:
H.264 – one of the most prominent video compression standards in use today. Used by virtually every video communications solution, from mobile video to desktop to room systems to telepresence systems. H.264 is also known as AVC (Advanced Video Coding) and has an SVC (Scalable Video Coding) extension. AVC is what is widely used and interoperable today, while SVC holds great promise once interoperability is established. Already used on the iPhone and other i-devices.
AAC – Advanced Audio Coding standard. Widely used today in audio and video communications, with established interoperability. Used by the iPhone and other i-devices.
SIP – Session Initiation Protocol – the de-facto standard for IP communications solutions, covering both voice and video. Used by the majority of video conferencing vendors, such as Cisco, Tandberg, Polycom, RADVISION and more. Also one of the core standards in 3GPP IMS (IP Multimedia Subsystem) communications. Highly interoperable, although it lacks dedicated definitions for IP video call control; the IMTC SIP Parity Activity Group developed a set of best common practices and use cases to improve the interoperability of SIP-based video communications.
STUN, TURN and ICE – typically used together to support firewall and NAT traversal. STUN (Session Traversal Utilities for NAT) lets a client behind a NAT discover the external IP address and port that the NAT maps its internal address to. TURN (Traversal Using Relays around NAT) relays the traffic through a server and is used in the rare cases where STUN doesn’t provide a solution. ICE (Interactive Connectivity Establishment) is an umbrella standard that uses STUN, TURN, UPnP and others to find a workable way to traverse the NAT and firewall.
RTP – Real-time Transport Protocol – used in voice and video over IP implementations to carry real-time media and to collect statistics such as jitter and latency, which in turn makes it possible to build better-quality solutions. Used today by the vast majority of IP voice and video vendors.
SRTP – Secure RTP, a profile of RTP for secure communication that supports encryption, message authentication and integrity. Widely used in both IP video and voice implementations.
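To make the RTP entry a bit more concrete, here is a minimal Python sketch of the two things mentioned above: reading the fixed 12-byte RTP header and updating a running jitter estimate. It follows the header layout and the jitter formula from RFC 3550 as I understand them; the names `RtpHeader`, `parse_rtp_header` and `update_jitter`, as well as the sample packet values, are my own illustrations and say nothing about how FaceTime itself is implemented.

```python
import struct
from dataclasses import dataclass


@dataclass
class RtpHeader:
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the 12-byte fixed RTP header (layout per RFC 3550)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    # Network byte order: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=b0 & 0x0F,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
    )


def update_jitter(jitter: float, transit_prev: int, transit_now: int) -> float:
    """One step of the interarrival jitter estimate: J += (|D| - J) / 16.

    A transit value is the arrival time minus the RTP timestamp,
    both expressed in RTP timestamp units.
    """
    d = abs(transit_now - transit_prev)
    return jitter + (d - jitter) / 16.0


# Hypothetical packet: version 2, payload type 96, sequence number 1000,
# timestamp 90000, SSRC 0x1234ABCD, followed by a dummy payload.
sample = struct.pack("!BBHII", 0x80, 96, 1000, 90000, 0x1234ABCD) + b"payload"
header = parse_rtp_header(sample)
print(header)
```

A receiver would call `update_jitter` once per arriving packet and report the running value back to the sender, which is the kind of statistic that lets an endpoint adapt its media quality. With SRTP the header stays readable like this while the payload is encrypted.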
We are looking forward to further implementations of this new technology. Things are getting very interesting in visual communication products.
—
This is good news for interoperability, as Skype uses a closed protocol and a proprietary video codec, and they don’t have any video peering policy. Now the real questions start:
- are there “proprietary” extensions in use (e.g. for SIP registration or for locating other parties)?
- how are iPhone users registering, and what infrastructure are they using?
- will Apple have a peering policy with other SIP providers?
On the last one, I am afraid that they will be happy to remain in their closed garden.
Apple's standards are open. But does FaceTime use a centralized server? Or is it Jabber-like, allowing the use of multiple servers?
As far as I know, H.264 is not as "open" as it seems: at least it is NOT royalty-free, and I think its widespread usage would lead to the disappearance of free (as in FREEDOM) platforms. And I don't want to live in a world where the only choice I have in order to speak (or video chat or whatever…) to a friend is between a bunch of closed, over-constrained and rights-restricting software and hardware!
Any idea how Apple manages real-time video encoding on the relatively low-end A4 processor? Or are they using extra graphics hardware?
Nice technology. FaceTime is really an innovative feature.