Beyond VoIP Protocols by Olivier Hersent, Jean-Pierre Petit, and David
  Gurle, Wiley, 2005.
This book covers what needs to be done to advance beyond a basic, end-to-end
  Internet telephone system, particularly with respect to protocols.
  
Digital Telephony, third edition by John Bellamy, Wiley, 2000.
By the early 90s, about the same time the Internet was taking off as a public
  utility, the American public telephone system completed its change-over from
  analog to digital within the circle of central offices.  This book describes
  what the change-over from analog to digital entailed.
  
Digital Telephony Over Cable by D. R. Evans, Addison Wesley, 2001.
Covers PacketCable, the Cable Consortium’s set of standards specifying
  a two-way digital communications system for cable TV systems.  
  
Internet Telephony edited by Lee McKnight, William Lehr and David Clark,
  MIT Press, 2001.
A somewhat risky book that tries to think its way into the future of
  Internet-based telephony and communication systems more generally.
  “What goes around comes around” is probably the most useful thing to be
  thinking while reading this book.
  
Signaling and Switching for Packet Telephony by Matthew Stafford, Artech
  House, 2004.
What can be done once the bearer and control planes are separated into
  independent devices.
 
Voice over IP by Uyless Black, Prentice Hall, 2002.
A good introductory book, reasonably complete and occasionally deep.  It will
  get you oriented in the VoIP landscape and sets you up to explore further.
  
Voice over IP Fundamentals by Jonathan Davidson and James Peters, Cisco
  Press, 2000.
A book published, you will have noticed, by Cisco designed to make technical
  managers comfortable and adept at constantly shoveling out more budget for
  Cisco boxes one size bigger than the ones they’ve already got.
 
VoIP Hacks by Ted Wallingford, O’Reilly, 2006.
A hodgepodge of tips & tools for Internet telephony.
 
   
   
   
   
  
  Some of these papers are freely available, some require registration,
    which you get automatically if you access the link from within
    the monmouth.edu domain.  If you're not within
    the monmouth.edu domain and can't get there, you have to be a
    member of the ACM or IEEE (depending on the paper) digital library.
An Architecture for Residential Internet Telephony Service
  by Christian Huitema, Jane Cameron, Petros Mouchtaris and Darek Smyk
  in IEEE Internet Computing, May-June 1999 (v. 3, n. 3).
An internet-telephony architecture should be able to handle millions of
  end-points, integrate seamlessly with the public telephone network (PTN)
  including SS7 support, and be as reliable as the PTN.  Given the
  dissimilarities between the Internet and the PTN, the architecture should be
  gateway-based, including a residential gateway, a trunking gateway, user
  agents, and the usual media gateways.
  
An Architecture for Secure VoIP and Collaboration Applications
  by Dimitris Zisiadis, Spyros Kopsidas and Leandros Tassiulas
  in the Third International Workshop on Security, Privacy and Trust
  in Pervasive and Ubiquitous Computing, 19 July 2007.
VoIP and collaboration Internet applications usually require registration in
  a central user database and use either two bridged client-server connections
  between the end users and the server or they allow direct client connections.
  Biometric-based procedures followed by the VoIPSec (voice interactive
  personalized security) protocol can provide end-to-end security for such
  applications.  This approach doesn’t need a trusted third-party
  authentication authority.
  
Anti-Vamming Trust Enforcement in Peer-to-Peer VoIP Networks
  by Nilanjan Banerjee, Samir Saklikar and Subir Saha
  in Proceedings of the 2006 International Conference on Wireless
  Communications and Mobile Computing.
I send you a letter and seal it with a wax imprint.  You trust the letter
  came from me because the name and wax imprint match. Let my name be a bit
  string n and the wax imprint be another bit string w with the
  property that prefix(h(w), t) = prefix(n, t).  prefix(b, k)
  is the first (leftmost) k bits from the bit string b, h()
  is a secure hash function, and t is a non-negative integer.  Because
  h() is impractical to invert, finding a wax imprint for which t
  is large is expensive; wax imprints with large t values are more
  trustworthy (in some sense) than wax imprints with small t values.
  Using a public key from a public-key cryptosystem as my name provides
  authentication by encoding the wax imprint with my private key.
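The prefix-matching condition above can be sketched in a few lines. This is only an illustration under stated assumptions: SHA-256 stands in for h(), the name is a made-up 256-bit string, and the brute-force search over an 8-byte counter is a hypothetical way to find an imprint, not the paper's procedure.

```python
import hashlib
from itertools import count


def prefix_bits(data: bytes, t: int) -> str:
    """Return the first (leftmost) t bits of data as a '0'/'1' string."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return bits[:t]


def h(w: bytes) -> bytes:
    # Assumption: SHA-256 plays the role of the secure hash h().
    return hashlib.sha256(w).digest()


def find_imprint(name: bytes, t: int) -> bytes:
    """Brute-force a wax imprint w with prefix(h(w), t) == prefix(name, t)."""
    target = prefix_bits(name, t)
    for counter in count():
        w = counter.to_bytes(8, "big")
        if prefix_bits(h(w), t) == target:
            return w


def verify(name: bytes, w: bytes, t: int) -> bool:
    """Check the imprint; this costs one hash, however hard w was to find."""
    return prefix_bits(h(w), t) == prefix_bits(name, t)


# Hypothetical name bit string (here, a digest of a made-up public key).
name = hashlib.sha256(b"alice-public-key").digest()
w = find_imprint(name, 12)  # expected work is about 2**12 hash evaluations
```

Each extra bit of t roughly doubles the expected search effort while verification stays at a single hash, which is the asymmetry the annotation relies on.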
Building Trustworthy Systems: Lessons from the PTN and Internet by Fred
  Schneider, Steven Bellovin and Alan Inouye in IEEE Internet Computing, November-December
  1999 (v. 3, n. 6).
The Internet and the public telephone network (PTN) have different ways of
  being attacked; skills learned on one network don’t transfer to the
  other.  However, their increasing integration makes each an ingress for
  attacks on the other.  The PTN’s eroding monopoly status and the
  Internet’s increasing commercialization give rise to a cloud of
  diverse, minimally-cooperative agents whose actions make matters worse.
  What can go wrong is well known; what is to be done isn’t clear.
  
Critical VPN Security Analysis and New Approach for Securing VoIP
  Communications over VPN Networks 
  by Wafaa Diab, Samir Tohme and Carole Bassil
  in Proceedings of the 3rd ACM Workshop on Wireless Multimedia Networking
  and Performance Modeling.
Many VoIP security attacks can be frustrated using encryption.  VPN is a
  standard mechanism for encrypting on the Internet, but is oriented toward
  non-real-time data streams. VPN encryption for VoIP should support real-time
  traffic using IP Security mechanisms and guarantee the performance and
  quality of service without reducing the effective bandwidth.
Decentralizing SIP
  by David Bryan and Bruce Lowekamp
  in ACM Queue, March 2007 (v. 5, n. 2).
A peer-to-peer (p2p) overlay network responds naturally to network
  connectivity and membership changes at the cost of introducing uncertainty
  about network state.  Hybrid p2p networks impose some structure - using, for
  example, a distributed hash table - to reduce the uncertainty at a cost of
  increasing the effort required to maintain the network.  Session Initiation
  Protocol (SIP) overlay networks are mostly distributed except for a few
  centralized services such as registration.  Moving a SIP network to a p2p
  network would make formerly centralized services unacceptably expensive, but
  a hybrid p2p network may provide an appropriate trade-off between the ability
  to react naturally to network-configuration changes and the cost of providing
  formerly centralized services.  
The Delay-Friendliness of TCP
  by Eli Brosh, Salman Abdul Baset, Dan Rubenstein and Henning Schulzrinne
  in Proceedings of the 2008 ACM SIGMETRICS International Conference on
  Measurement and Modeling of Computer Systems.
Despite admonitions not to, many real-time Internet applications use TCP for
  data transport.  How does that work out for them?  A Markov-chain model
  validated by simulations on networks shows that low packet-loss rates produce
  small (< 1 sec.) TCP delays, that as the loss rate increases the RTT should
  decrease to compensate, and that large streams (500 kb/s video) are more
  affected than small streams (64 kb/s audio).  Also, apart from the usual
  parameter games (big window size, no Nagle, no byte counting, use SACK and so
  on), splitting large packets into small ones may help the stream but may hurt
  the network, and using parallel streams helps muchly.  
The Economics of the Internet: Utility, Utilization, Pricing and Quality of Service
  by Andrew Odlyzko, AT&T Research, 7 July 1998.
Can throwing bandwidth at the Internet solve congestion problems?  Can it
  solve congestion problems as efficiently and effectively as other approaches,
  such as various quality of service (QoS) regimes?  Many people say no, but
  it’s not clear why that’s the correct answer.  
The Effect of Packet Dispersion on Voice Applications in IP networks by Hanoch
  Levy and Haim Zlatokrilov in IEEE/ACM Transactions on Networking, April 2006 (v. 14, n. 2).
Defines the noticeable packet loss (NPL) metric, which weights packet
  loss occurring close together more heavily than dispersed packet loss (that
  is, bursty over Bernoulli loss), and then models how packet loss under
  dispersed packet routing affects NPL.  Packets are distributed among routes
  randomly, cyclically, or round-robin.  Route diversity does improve NPL, but
  the assumptions used (particularly for independence and receive-side packet
  handling) to carry the analysis give one pause.
  
Enabling SIP-Based Sessions in Ad Hoc Networks 
  by Nilanjan Banerjee, Arup Acharya and Sajal Das
  in Wireless Networks, August 2007 (v. 13, n. 4).
Session Initiation Protocol (SIP) servers running in the Internet have a
  relatively stable infrastructure on which to build an overlay network for
  endpoint discovery and session establishment.  Ad hoc networks do not provide
  a stable infrastructure and require extra techniques to support SIP-based
  overlay networks.  One technique, the loosely coupled approach, relies on the
  underlying ad-hoc routing and provides endpoint discovery.  Another
  technique, the tightly coupled approach, includes session establishment by
  defining a virtual topology among clusters of end-points.  Simulations show
  that tight coupling is better (has lower latency) in stable networks while
  loose coupling is better in dynamic networks.  In all cases the extra
  structure provided by tight coupling has less control overhead than does
  loose coupling.
  
End-To-End Arguments in System Design 
  by Jerome Saltzer, David Reed and David Clark
  in ACM Transactions on Computer Systems, November 1984 (v. 2, n. 4).
What services should a network provide?  The end-to-end argument
  answers this question by assuming each service added to the network is
  enormously expensive and requires showing that the enormous expense will be
  amortized over all network users.  If that totalizing amortization
  can’t be carried out, the feature doesn’t belong in the
  network.
  
From POTS to PANS: A Commentary on the Evolution to Internet Telephony by
  Christos Polyzois, Hal Purdy, Ping-Fai Yang, David Shrader, Henry Sinnreich,
  François Ménard and Henning Schulzrinne in IEEE Internet Computing,
  May-June 1999 (v. 3, n. 3).
The Internet has a structure significantly different from that of the public
  telephone network (PTN), both in the network and at the end-points.  At least
  initially, the Internet phone services will echo those of the PTN, raising
  the question of what should be brought over from the PTN and what should be
  reconsidered anew.  The PTN’s Intelligent Network infrastructure is
  the most likely contact point for IP networks, both as a way to use
  existing PTN services and functions and as a way to hook in new
  Internet-based services.
  
Guaranteeing Multiple QoSs in Differentiated Services Internet
  by Hoon Lee, Hyejin Kwon and Yoshiaki Nemoto
  in Proceedings of the Seventh International Conference on Parallel and
  Distributed Systems.
An architecture to guarantee multiple Quality of Services (QoSs), including
  the IETF’s Differentiated Service (DiffServ) architecture and the user
  application’s requirements.  A prioritized packet service scheme using
  weighted round-robin in the core router supports weighted priority services
  for the three IETF service classes: EF (Expedited forwarding), AF (Assured
  forwarding) and DF (Default forwarding).
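The weighted round-robin service discipline mentioned above can be sketched as follows. The per-round quotas (3, 2, 1) and the packet labels are assumptions chosen only for illustration; the paper's actual weights and router design are not reproduced here.

```python
from collections import deque

# Hypothetical per-round packet quotas for the three service classes.
WEIGHTS = {"EF": 3, "AF": 2, "DF": 1}


def weighted_round_robin(queues, weights):
    """Yield packets by visiting each class in turn and taking up to
    weights[cls] packets from its queue per round, until all are empty."""
    while any(queues.values()):
        for cls, quota in weights.items():
            q = queues[cls]
            for _ in range(quota):
                if not q:
                    break  # this class has nothing left in this round
                yield q.popleft()


queues = {
    "EF": deque(f"ef{i}" for i in range(4)),
    "AF": deque(f"af{i}" for i in range(3)),
    "DF": deque(f"df{i}" for i in range(2)),
}
order = list(weighted_round_robin(queues, WEIGHTS))
# order: ef0 ef1 ef2 af0 af1 df0 ef3 af2 df1
```

Unlike strict priority, the lower classes still get served in every round, so default traffic is slowed rather than starved.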
  
Holistic VoIP Intrusion Detection and Prevention System
  by Mohamed Nassar, Saverio Niccolini, Radu State and Thilo Ewald
  in Proceedings of the First International Conference on Principles,
  Systems and Applications of IP Telecommunications, 2007.
Bruce Schneier often points out that several
  flexible, lightweight security layers often combine to provide better
  overall security than does a single, heavily armored bastion.  Holistic VoIP
  security illustrates Schneier’s point by using two layers to provide
  VoIP security.  The first layer is a VoIP honeypot to collect and analyze
  data on attacks.  The second layer is an event correlator that observes a
  working VoIP system and flags operation sequences that seem suspicious.
  
 
Integrating Internet Telephony Services by Wenyu Jiang, Jonathan Lennox,
  Sankaran Narayanan, Henning Schulzrinne, Kundan Singh and Xiaotao Wu 
  in IEEE Internet Computing, May-June 2002 (v. 6, n. 3).
Cinema (Columbia Internet extensible multimedia architecture) is a SIP-based
  subsystem that hosts various multimedia facilities such as conferencing
  (bridging), streaming media, unified voice messaging, and address resolution.
  Cinema integrates with existing voice networks and end-points via SIP
  proxies and gateways.   
  
Integration of Call Signaling and Resource Management for IP Telephony by
  Pawan Goyal, Albert Greenberg, Charles Kalmanek, 
  William Marshall, Partho Mishra, Doug Nortz and K. K. Ramakrishnan
  in IEEE Internet Computing,
  May-June 1999 (v. 3, n. 3).
An IP network usually has computing devices of varying power serving as
  end-points and network nodes.  A signaling architecture for such a network
  should be distributed so work can be performed at the most appropriate
  location and open so new services and reimplementations of old services can
  be easily added.  Distribution requires scheduling to determine which
  locations are appropriate and to dispatch work to those locations; QoS issues
  — such as packet loss, delay, and jitter
  — can be a first-cut driver for making scheduling
  decisions.  
  
A Modular Architecture for Providing Carrier-Grade SIP Telephony Services
  by Hechmi Khlifi and Jean-Charles Grégoire in the
  Third IEEE International Conference on Wireless and Mobile Computing.
A modular, flexible and scalable architecture to provide mass-market
  telephony services in SIP environments.  The architecture uses
  Parlay, a standard, object-oriented and signaling protocol-neutral API,
  and SIP to separate application logic and network function and, at the
  network level, signaling and media processing.  
Peer-to-Peer Internet Telephony Using SIP
  by Kundan Singh and Henning Schulzrinne
  in Proceedings of the International Workshop on Network and Operating 
  Systems Support for Digital Audio and Video, 13–14 June 2005, pages
  63–68.
Internet telephony (IT) networks embedded in the Internet have the usual tree
  hierarchy structure.  An alternative structure flattens IT subtrees (domains)
  into peer sets with no hierarchy.  A flat domain should improve reliability
  and change accommodation while making it harder to find resources. Session
  Initiation Protocol servers in a flat hierarchy can run a peer-to-peer (P2P)
  network protocol, such as Chord or Content-Addressable Network, to organize
  themselves.  However, typical P2P services are latency tolerant and exploit
  resource replication while IT services are latency intolerant and
  can’t easily replicate many resources (end users and databases, for
  example).  P2P security and economics models also match poorly with the
  equivalent IT models.  
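As a rough picture of how SIP servers in a flat domain might locate the peer responsible for a user, here is a minimal sketch of the successor rule used by Chord-style rings. It is a simplification under stated assumptions: a sorted list replaces Chord's finger tables and join/leave handling, SHA-1 supplies the ring identifiers, and the host names and SIP URI are made up.

```python
import hashlib
from bisect import bisect_left


def ring_id(key: str) -> int:
    """Hash a key onto the identifier ring (160-bit SHA-1 space)."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest(), "big")


def successor(node_ids, key_id):
    """Return the first node id at or clockwise after key_id, wrapping
    around to the smallest id when key_id is past the last node."""
    i = bisect_left(node_ids, key_id)
    return node_ids[i % len(node_ids)]


# Hypothetical SIP servers acting as peers in one flat domain.
HOSTS = ("sip1.example.org", "sip2.example.org", "sip3.example.org")
nodes = {ring_id(host): host for host in HOSTS}
node_ids = sorted(nodes)


def responsible_peer(sip_uri: str) -> str:
    """Name the peer that would store the registration for sip_uri."""
    return nodes[successor(node_ids, ring_id(sip_uri))]


peer = responsible_peer("sip:alice@example.org")
```

Every peer computes the same answer from the same membership list, so no central registrar is needed; the cost is keeping that membership view current as peers come and go.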
Programming Internet Telephony Services by Jonathan Rosenberg, Jonathan
  Lennox and Henning Schulzrinne in IEEE Internet Computing, May-June 1999 (v. 3, n. 3).
A control plane full of SIP servers can be induced to provide new services
  using a CGI-like mechanism. New services are implemented as programs
  independent of SIP servers and then invoked as independent processes by SIP
  servers when the service is needed.  A call-processing language,
  circumscribed in its abilities to limit dangerous operations and to make it
  statically checkable, makes it possible for end-users to implement custom
  services.
  
Providing Emergency Services in Internet Telephony by Henning Schulzrinne and
  Knarig Arabshian in IEEE Internet Computing, May-June 2002 (v. 6, n. 3).
Emergency communications systems impose new requirements, such as universal
  numbering, call routing, and caller number and location identification, as
  well as the usual performance and reliability requirements on IP-based
  voice-service networks.  Replicating the emergency PSTN architecture is
  (relatively) straightforward, but an IP network’s modular,
  service-based structure allows for new architectures with better flexibility
  and scalability.
  
Real-Time Voice Communication over the Internet Using Packet Path Diversity
  by Yi Liang, Eckehard Steinbach and Bernd Girod
  in Proceedings of the Ninth ACM International Conference on Multimedia,
  pages 431–440.
The quality of real-time voice communication over best-effort networks is
  mainly determined by the delay and loss characteristics observed along the
  network path.  Excessive playout buffering at the receiver is prohibitive and
  significantly delayed packets have to be discarded and considered as late
  loss. We propose to improve the tradeoff among delay, late loss rate, and
  speech quality using multi-stream transmission of real-time voice over the
  Internet, where multiple redundant descriptions of the voice stream are sent
  over independent network paths.  Scheduling the playout of the received voice
  packets is based on a novel multi-stream adaptive playout scheduling
  technique that uses a Lagrangian cost function to trade delay versus loss.
  Experiments over the Internet suggest largely uncorrelated packet erasure and
  delay jitter characteristics for different network paths which leads to a
  noticeable path diversity gain.  We observe significant reductions in mean
  end-to-end latency and loss rates as well as improved speech quality when
  compared to FEC protected single-path transmission at the same data rate.  In
  addition to our Internet measurements, we analyze the performance of the
  proposed multi-path voice communication scheme using the ns network simulator
  for different network topologies, including shared network links.  
  
SCTP: A Proposed Standard for Robust Internet Data Transport
  by Armando Caro, Jr., Janardhan Iyengar, Paul Amer, Sourabh Ladha, Gerard
  Heinz, II and Keyur Shah
  in IEEE Computer, November 2003 (v. 36, n. 11).
The Stream Control Transmission Protocol (SCTP) provides associations
  between processes on hosts; each association contains one or more
  unidirectional streams.  SCTP provides flow- and congestion-controlled
  reliable packet transport; each packet is a mixture of control and data blocks.
  SCTP end-points can straddle several IP addresses on each host (multihoming);
  set-up uses a four-way handshake to avoid SYN attacks and a three-way
  tear-down for speed
  (and eliminating TCP’s half-close semantics).
  
Security Issues with the IP Multimedia Subsystem (IMS) by Michael Hunter,
  Russ Clark and Frank Park in Workshop on Middleware for Next-generation
  Converged Networks and Applications, Newport Beach, California, 26–30 November 2007.
The IP Multimedia Subsystem (IMS) is designed to support convergent
  services comprising voice and data.  IMS security and related concerns cover all the
  usual suspects (QoS, billing, services, regulation, security) from the
  providers’ and users’ perspectives.
  Apart from a new, more complex architecture, IMS-relevant consideration of
  these areas will be familiar to those with experience in other areas of
  Internet-based subsystem design.  
  
Security Patterns for Voice over IP Networks by Eduardo Fernandez,
  Juan Pelaez and Maria Larrondo-Petrie in Proceedings of the International
  Multi-Conference on Computing in the Global Information Technology,
  4–9 March, 2007.
The grand convergence of voice, video and data on VoIP networks is a source
  of great hope, but also a source of security concerns due to the lack of
  isolation between the bit streams.  Various system structures, described as
  software patterns, can re-establish isolation to improve security.  The
  patterns involve encryption, network segmentation, tunneling, and
  authentication.
  
The Session Initiation Protocol: Internet-Centric Signaling by Henning
  Schulzrinne and Jonathan Rosenberg in IEEE Communications, October 2000
  (v. 38, n. 10).
The Session Initiation Protocol (SIP) provides signaling and control for
  multimedia services.  SIP locates resources based on a location-independent
  name and negotiates session characteristics.  It can be used for Internet
  telephony and conferencing, instant messaging, event notification, and the
  control of networked devices.  SIP is a typical IETF protocol: text-based,
  line-oriented, request-response.  Designed to be extensible, SIP has been
  extended in several ways to define new services (instant messaging, for
  example) and features (authentication, for example).  
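To make "text-based, line-oriented, request-response" concrete, here is a minimal sketch that splits a SIP request into its start line, headers and body. The message and its addresses are made up for illustration, and the parser is deliberately naive: it ignores folded header lines, compact header names and repeated headers, all of which a real implementation has to handle.

```python
def parse_sip_request(raw: str):
    """Split a SIP request into (method, request_uri, version, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")  # blank line ends the headers
    lines = head.split("\r\n")
    method, request_uri, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, request_uri, version, headers, body


# Illustrative request with hypothetical addresses; real requests carry
# more headers (Via, Max-Forwards, Contact and so on).
raw = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "To: <sip:bob@example.com>\r\n"
    "From: <sip:alice@example.org>;tag=1234\r\n"
    "Call-ID: abc123@example.org\r\n"
    "CSeq: 1 INVITE\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)
method, uri, version, headers, body = parse_sip_request(raw)
```

The resemblance to HTTP is intentional in SIP's design, which is part of why new methods and headers can be added without disturbing existing parsers.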
  
A SIP-Based Conference Control Framework
  by Petri Koskelainen, Henning Schulzrinne and Xiaotao Wu
  in Proceedings of the 12th International Workshop on Network and   
  Operating Systems Support for Digital Audio and Video.
Conference services in Internet-telephony (IT) systems should be implemented
  in a way consistent with IT to reap the benefits of such systems.  SIP-based
  coordination using SOAP provides the mechanisms for conference and floor
  control. Central SIP servers and unicast should be good enough for small
  conferences, but larger conferences probably require distributed servers or
  multicast or both.
  
SOVoIP: Middleware for Universal VoIP Connectivity
  by M. J. Arif, S. Karunasekera and S. Kulkarni
  in 8th ACM/IFIP/USENIX International Conference on Middleware.
VoIP has a number of protocols that don’t interoperate, but instead
  are coordinated by protocols such as SIP or H.323.  For some reason, SIP or
  H.323 don’t look enough like middleware, so maybe they can be replaced
  (or supplemented, it isn’t clear) by CORBA or web services.  Naturally
  CORBA is right out, because of its firewall difficulty and performance,
  leaving web services in the form of Service Oriented VoIP (SOVoIP).  Just to
  make sure, SOVoIP performs better than CORBA, but no comparisons are made
  with SIP or H.323.
  
Terminating Telephony Services on the Internet by
  Vijay Gurbani and Xian-He Sun in IEEE/ACM Transactions on Networking, August
  2004 (v. 12, n. 4).
How to originate a service in the telephone network and terminate it in an
  Internet-based network using standard protocols (SIP, HTTP, XML) and a
  publish-subscribe architecture.  The desire to avoid middleware is admirable,
  but requiring direct access to signaling is troubling.  It’s also
  unclear whether the same architecture can apply in the Internet-to-telephone
  direction.  
  
Time Synchronization for VoIP Quality of Service 
  by Hugh Melvin and Liam Murphy
  in IEEE Internet Computing, 
  May-June 2002 (v. 6, n. 3).
Effectively handling time-sensitive voice playout over the Internet requires
  good and stable information about end-to-end delays.  Relatively simple
  estimation at the receiver’s end works well as long as the estimates
  don’t drift too rapidly.  Time synchronization via GPS provides a
  uniform, stable time signal end-points can use to produce accurate, stable
  delay measurements.
  
Towards a new Security Architecture for Telephony 
  by Carole Bassil, Ahmed Serhrouchni and Nicolas Rouhana
  in Proceedings of the International Conference on Networking,
  International Conference on Systems and International Conference on Mobile
  Communications and Learning Technologies (ICNICONSMCL ’06).
The telephone and VoIP networks place different emphasis in their security
  policies and use different mechanisms to achieve their policies.  This
  difference is yet another gap that has to be bridged in the networks’
  grand convergence.  However, rather than using gateways to translate between
  the security mechanisms, a shim layer in each network protocol stack would
  allow each security mechanism to be translated to a common mechanism
  providing a secure end-to-end voice communication.
  
Tussle in Cyberspace: Defining Tomorrow’s Internet by
  David Clark, John Wroclawski, Karen Sollins and Robert Braden in IEEE/ACM Transactions on Networking, June
  2005 (v. 13, n. 3).
A tussle is a clash of interests among competing parties in a system.
  The Internet was designed and implemented in a relatively tussle-free
  environment; however, the Internet’s current popularity and importance
  have increased the number and diversity of competing parties and greatly
  increased the number of tussles, making the original design principles less
  useful than they once were.  New design principles should recognize and
  identify places where tussles may occur and support late binding to allow a
  range of possible resolutions.  
  
Ubiquitous Computing using SIP
  by Stefan Berger, Henning Schulzrinne, Stylianos Sidiroglou and Xiaotao Wu
  in Proceedings of the 13th International Workshop on Network and
  Operating Systems Support for Digital Audio and Video.
The Session Initiation Protocol (SIP) is an open, extensible, distributed,
  request-response infrastructure.  Extending a SIP-based communication system
  with user-location information allows for services that follow you around and
  customize themselves to your location.  Such an extension requires a
  subsystem for discovering user location, a subsystem for managing location
  information, and a subsystem for reacting to location state.
  
Unified Communications with SIP
  by Martin Steinmann
  in ACM Queue, March 2007 (v. 5, n. 2).
Proprietary PBXs are disappearing because standard and open-source
  Internet-telephony software, built on protocols such as SIP, can provide
  similar services with more flexibility and at less cost, and is easy to
  extend to provide new services.  
  
A Voice Over IP Service Architecture for Integrated Communications by Daniele
  Rizzetto and Claudio Catania in IEEE Internet Computing, May-June 1999 (v. 3, n. 3).
The unification of voice and data traffic in the Internet overshadows an
  increasing separation between the control and data parts of the network.
  Emphasizing the control-data separation can make it simpler to efficiently
  implement advanced services, as well as isolate each part from
  technological change in the other part.  A service architecture providing an
  abstract API to the control network preserves the advantages of separated control
  and data.  
  
VoIP Security and Privacy Threat Taxonomy,
  VOIPSA, 24 October 2005.
All (most? some? a few?) of the goblins that could get you if you
  don’t watch out.  
  
VoIP Security: Not an Afterthought  by Douglas Sicker and Tom Lookabaugh in
  ACM Queue, September 2004 (v. 2, n. 6).
The things that make VoIP interesting and important (distributed
  operation, flexibility, openness) also make it hard to secure.  One
  advantage is an Internet base, which comes with existing relevant security
  work and research.  
  
VoIP: What is it Good For?  by Sudhir Ahuja and Robert Ensor in ACM Queue,
  September 2004 (v. 2, n. 6).
A brief, high-level comparison between service implementation in the PSTN and
  over VoIP networks, mostly to the favor of VoIP networks.  Recognizes the
  importance of service development to the future of VoIP networks, but then
  presents lame examples (click-to-dial web page links, persistent chat).
  
You Don’t Know Jack About VoIP by Phil Sherburne and
  Cary Fitzgerald in ACM Queue, September 2004 (v. 2, n. 6).
Voice over Internet shows great promise due to network flexibility and
  openness, but also presents great challenges given the service requirements for
  good quality voice traffic, as well as management and security requirements.
  
  
  This page last modified on 14 August 2008.