Parallel Architecture and Programming (6): Scaling a Website

Scale-out Parallelism, Elasticity, and Caching.

  • focus
    • parallelism
    • locality
  • metrics
    • throughput
    • latency

1. setting the number of workers for a web server

  • Parallelism: use all the server’s cores
  • Latency hiding: hide long-latency disk read operations
  • Concurrency:
    • many outstanding requests;
    • service quick requests
  • Footprint: avoid so many threads that their aggregate working set causes thrashing
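The latency-hiding point above can be made concrete: when requests block on I/O, running more worker threads than cores overlaps the waits. A minimal sketch, assuming a hypothetical 4-core server and `time.sleep` standing in for a disk read (all numbers are illustrative):

```python
import time
from concurrent.futures import ThreadPoolExecutor

CORES = 4        # hypothetical core count
IO_TIME = 0.05   # simulated disk-read latency per request (seconds)

def handle_request(i):
    # Simulated long-latency disk read: the thread blocks, freeing the core.
    time.sleep(IO_TIME)
    return i

def serve(num_workers, num_requests=16):
    """Serve num_requests with a pool of num_workers threads; return wall time."""
    start = time.time()
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        list(pool.map(handle_request, range(num_requests)))
    return time.time() - start

# With one worker the I/O waits serialize (~16 * 0.05 s); with 4x the core
# count the waits overlap and total latency drops sharply.
t_serial = serve(1)
t_oversubscribed = serve(CORES * 4)
```

The flip side is the footprint bullet: each extra thread adds stack and working-set pressure, so the pool size is a tradeoff, not "as many as possible".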

2. dynamic content

dynamic content: the response is not a static page but the result of application logic running in response to a request.

2.1. scale for higher throughput

scripting-language performance is poor, so one way to get higher throughput is to scale out the web servers

requests -> load balancer -> multiple web servers <-> database
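The load balancer in the flow above can be as simple as round-robin rotation over the server pool. A minimal sketch (server names are hypothetical; real balancers also do health checks and weighting):

```python
import itertools

class RoundRobinBalancer:
    """Rotate incoming requests across a fixed pool of web servers."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def route(self, request):
        # Each request goes to the next server in rotation.
        return next(self._cycle)

lb = RoundRobinBalancer(["web1", "web2", "web3"])
assignments = [lb.route(f"req{i}") for i in range(6)]
# → ["web1", "web2", "web3", "web1", "web2", "web3"]
```

This even spreading is exactly what session affinity (below) gives up.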

  • prob: how to keep session state consistent across servers
    • approach 1: session affinity
      • all requests associated with a session are directed to the same server
      • load balancing is poor
      • do not have to change web-application design to implement scale out
    • approach 2: persist session state in the database, making the web servers stateless
      • load balancing is not a problem
      • web servers are easy to scale
      • the database becomes the bottleneck
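The two approaches above can be sketched side by side. Server names and the in-memory dict standing in for the database are assumptions for illustration:

```python
import hashlib

SERVERS = ["web1", "web2", "web3"]  # hypothetical pool

def affinity_route(session_id):
    # Approach 1: hash the session id so every request in a session lands
    # on the same server. Sticky, but load can be uneven.
    h = int(hashlib.sha256(session_id.encode()).hexdigest(), 16)
    return SERVERS[h % len(SERVERS)]

# Approach 2: stateless servers. Session state lives in a shared store
# (here a dict standing in for the database), so ANY server can handle
# any request in the session.
SESSION_DB = {}

def stateless_handle(server, session_id, key, value):
    # `server` is unused on purpose: which server runs this doesn't matter.
    state = SESSION_DB.setdefault(session_id, {})  # load from shared store
    state[key] = value                             # update session state
    return state
```

With approach 2 the balancer stays free to route anywhere, at the cost of a round trip to the shared store on every request.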

  • prob: scaling out the database
    • replicate: maintain read-only replicas and parallelize reads across them
      • writes are still a problem
      • consistency issues between primary and replicas
    • partition: split (shard) the data across multiple databases
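A toy sketch of the replication idea: one primary takes all writes, and reads fan out to any replica. Replication here is synchronous for simplicity; real systems usually replicate asynchronously, which is where the consistency issues above come from.

```python
import random

class ReplicatedDB:
    """One write primary plus read-only replicas (illustrative only)."""
    def __init__(self, num_replicas=2):
        self.primary = {}
        self.replicas = [{} for _ in range(num_replicas)]

    def write(self, key, value):
        # Writes hit the primary, then propagate to every replica.
        self.primary[key] = value
        for r in self.replicas:
            r[key] = value

    def read(self, key):
        # Reads are parallelizable: any replica can serve them.
        return random.choice(self.replicas).get(key)

db = ReplicatedDB()
db.write("user:1", "alice")
```

Note that `write` still touches every copy, so replication helps read throughput but not write throughput; that is what partitioning addresses.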

2.2. how many web servers do we need?

  • observation: site load is bursty
    • provisioning the site for the average-case load -> poor quality of service during peak usage
    • provisioning the site for the peak-case load -> many idle servers most of the time
  • solution: elasticity
    • foundation: the web servers are stateless
    • the site automatically adds or removes web servers from the worker pool based on measured load
    • sources of on-demand servers: AWS, Azure, etc.
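An elasticity policy can be as small as two thresholds: grow the pool when per-server load is high, shrink it when low. A sketch with illustrative thresholds (real autoscalers add cooldowns and smoothing):

```python
def desired_pool_size(current, load_per_server, low=0.3, high=0.7):
    """Return the new pool size given measured per-server load (0.0-1.0)."""
    if load_per_server > high:
        return current + 1   # provision another stateless web server
    if load_per_server < low and current > 1:
        return current - 1   # release an idle server
    return current

# Simulated load samples over time: a burst, then a lull.
pool = 2
for load in [0.9, 0.9, 0.5, 0.1, 0.1]:
    pool = desired_pool_size(pool, load)
```

This only works because the servers are stateless: a server can be killed at any time without losing session state.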

2.3. reuse and locality

  • cache: cache commonly accessed objects
    • sits between the web servers and the database
    • introduces a cache-consistency (invalidation) problem
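The object cache above is typically used cache-aside: read from the cache first, fall back to the database, and invalidate on writes. A minimal sketch with dicts standing in for the cache and database:

```python
DATABASE = {"item:1": "widget"}   # stands in for the real database
CACHE = {}                        # stands in for the object cache

def get(key):
    # Cache-aside read: hit the cache first, fall back to the DB on a miss.
    if key in CACHE:
        return CACHE[key]
    value = DATABASE[key]
    CACHE[key] = value            # populate for later reads
    return value

def put(key, value):
    # Write to the DB and invalidate the stale cached copy. Forgetting
    # this invalidation is the cache-consistency problem in miniature.
    DATABASE[key] = value
    CACHE.pop(key, None)
```

With multiple web servers the cache must be shared (or invalidations broadcast), otherwise one server's write leaves stale copies on the others.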

  • front-end cache: cache web server responses
    • reduces load on the web servers
    • sits between the load balancer and the web servers

  • content distribution networks (CDN): serve bulky static content such as images and videos
    • redirect request URLs to nearby CDN servers based on locality information
      • higher bandwidth
      • lower latency
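The locality-based redirect above amounts to picking the edge server closest to the client. A sketch with hypothetical edge names and made-up RTT measurements:

```python
# Hypothetical CDN edge locations with measured RTT (ms) to this client.
EDGES = {"us-east": 40, "eu-west": 120, "ap-south": 220}

def redirect(path):
    # Send the client to the lowest-latency edge for this asset.
    edge = min(EDGES, key=EDGES.get)
    return f"https://{edge}.cdn.example/{path}"

url = redirect("video.mp4")
# → "https://us-east.cdn.example/video.mp4"
```

Real CDNs usually make this choice in DNS resolution or anycast routing rather than in application code, but the goal is the same: serve bytes from near the client.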