...
Fastly is a CDN (content delivery network) provider. We use the CDN to improve reachability by taking advantage of its content caching nodes, which are spread all over the world.
We have added our HAproxy load balancers as backends for the SeamlessAccess services in Fastly. The Fastly cache nodes load balance between our HAproxy servers by forwarding user requests to them. Over time, as responses are cached on the Fastly cache servers nearest to the users and served from there with minimal delay, their speed and availability no longer depend on our HAproxy load balancers. In addition, the Fastly user interface offers statistics with graphs showing the hits, misses and errors of the HTTPS requests sent to our HAproxy servers, both in real time and historically. It is worth looking at this data to get a picture of how much traffic is handled by Fastly and by our servers.
Services that are hosted in Fastly are
- md.seamlessaccess.org
- service.seamlessaccess.org
- use.thiss.io, md.thiss.io
- www.seamlessaccess.org
The configuration of these services resides at https://manage.fastly.com/services/all
For md.seamlessaccess.org, Fastly monitors the status of our load balancers for the thiss-js servers by sending GET /manifest.json requests to them.
For service.seamlessaccess.org, Fastly monitors the status of our load balancers for the thiss-mdq servers by sending HEAD /status requests to them.
If a load balancer is marked unhealthy by the health checks, Fastly stops sending requests to it.
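To reproduce these health checks by hand, the probes can be sketched in Python as below. This is a minimal sketch, not the Fastly configuration itself: the origin base URL is a placeholder (the hosts Fastly actually probes are set per service in Fastly), and treating any HTTP 200 answer as healthy is an assumption.

```python
import urllib.error
import urllib.request

# Probe per Fastly service, as described above: (HTTP method, path).
HEALTH_CHECKS = {
    "md.seamlessaccess.org": ("GET", "/manifest.json"),
    "service.seamlessaccess.org": ("HEAD", "/status"),
}


def build_probe(service: str, origin_base_url: str) -> urllib.request.Request:
    """Build the probe request for a service against one origin (load balancer)."""
    method, path = HEALTH_CHECKS[service]
    return urllib.request.Request(origin_base_url.rstrip("/") + path, method=method)


def is_healthy(probe: urllib.request.Request, timeout: float = 5.0) -> bool:
    """Return True if the origin answers with HTTP 200 (assumed criterion)."""
    try:
        with urllib.request.urlopen(probe, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection errors, timeouts and HTTP error codes all count as unhealthy.
        return False
```

For example, `is_healthy(build_probe("service.seamlessaccess.org", "https://<load balancer>"))` sends the same kind of HEAD /status request to one load balancer.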
Troubleshooting
- How to check the cache status of a URL (see https://docs.fastly.com/en/guides/checking-cache):
| Code Block |
|---|
| curl -I -H "Fastly-Debug: 1" https://service.seamlessaccess.org |
- How to check the cache on the Fastly edge nodes (you need to create an API key for this):
| Code Block |
|---|
| curl -s "https://api.fastly.com/content/edge_check?url=https://service.seamlessaccess.org/990.js" -H 'Fastly-Key: xxx' |
- Check the origin latency for each service at https://manage.fastly.com/configure/services/
- Check the stats at https://manage.fastly.com/observability/dashboard/system/overview/historic
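The cache check with the Fastly-Debug header can also be scripted. The sketch below is illustrative: the function names are made up for this page, and it assumes that the response carries the usual X-Cache, X-Cache-Hits and X-Served-By headers and that, when several comma-separated values are present, the last one describes the node closest to the client.

```python
import urllib.request


def fetch_debug_headers(url: str, timeout: float = 10.0) -> dict:
    """Send a HEAD request with Fastly-Debug enabled and return the response headers."""
    request = urllib.request.Request(url, method="HEAD", headers={"Fastly-Debug": "1"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return {name.lower(): value for name, value in response.getheaders()}


def summarize_cache_headers(headers: dict) -> dict:
    """Extract the cache-related fields from a set of response headers."""
    lowered = {name.lower(): value for name, value in headers.items()}
    # X-Cache may hold several values, e.g. "MISS, HIT".
    states = [s.strip().upper() for s in lowered.get("x-cache", "").split(",") if s.strip()]
    return {
        "states": states,
        # Assumption: the last value belongs to the node that answered the client.
        "served_from_cache": bool(states) and states[-1] == "HIT",
        "served_by": lowered.get("x-served-by", ""),
        "cache_hits": lowered.get("x-cache-hits", ""),
    }
```

Calling `summarize_cache_headers(fetch_debug_headers("https://service.seamlessaccess.org"))` gives a quick view of whether the URL was answered from the cache.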
Access
https://wiki.sunet.se/pages/viewpage.action?pageId=83493119
Internal Components
Aggregator & Publisher
Servers
| Name | Location | Env |
|---|---|---|
| meta.aws1.geant.eu.seamlessaccess.org | Frankfurt, AWS | Production |
| meta.aws2.geant.eu.seamlessaccess.org | N. California, AWS | |
| meta.ntx.sunet.eu.seamlessaccess.org | Nutanix, SUNET | |
| meta.se-east.sunet.eu.seamlessaccess.org | STO1v2, Safespring | |
| a-1.thiss.io | STO1v2, Safespring | Beta |
| a-staging-2.thiss.io | STO1v2, Safespring | Staging |
...
- Monitor the date when the metadata JSON files were last modified, from https://<site link>/manifest.json
- SSL check and availability of the site links
- The string 'OK' is found in https://<site link>/status
- Monitor that both backends are up by checking the HAproxy stats from http://<site link>:8404/stats. This link is accessible only via the SUNET VPN, for SUNET NOC members and the monitor server.
The site links are
https://md.ntx.sunet.eu.seamlessaccess.org/
https://md.se-east.sunet.eu.seamlessaccess.org
https://md.aws1aws2.geant.eu.seamlessaccess.org
https://md.aws2aws1.geant.eu.seamlessaccess.org
Use the General Troubleshooting section to fix alarms. If the thiss-mdq servers are unavailable, this raises alarms on the HAproxy servers; in that case, check the section for the thiss-mdq servers to troubleshoot them.
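The manifest date check and the 'OK' check can be approximated by hand with a few lines of Python. This is only a sketch under stated assumptions: it takes the modification date from an HTTP Last-Modified header value, and the maximum age is a hypothetical threshold; the actual checks and thresholds are the ones configured on the monitor server.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def status_is_ok(status_body: str) -> bool:
    """Return True if the /status response body contains the string 'OK'."""
    return "OK" in status_body


def manifest_is_fresh(last_modified: str, max_age: timedelta, now: datetime) -> bool:
    """Return True if the Last-Modified value is no older than max_age.

    `last_modified` is an HTTP date such as 'Wed, 21 Oct 2015 07:28:00 GMT';
    `now` must be a timezone-aware datetime.
    """
    modified = parsedate_to_datetime(last_modified)
    return now - modified <= max_age
```

For example, `manifest_is_fresh(value, timedelta(days=1), datetime.now(timezone.utc))` reports whether the manifest changed within the last day.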
...
The HAproxy service runs in a docker container and its configuration is supplied by puppet manifests.
Monitoring
We have the following checks for these load balancers, for each site, in https://monitor.seamlessaccess.org/
- SSL check and availability of the site links
- Monitor that both backends are up by checking the HAproxy stats from http://<site link>:8404/stats. This link is accessible only via the SUNET VPN, for SUNET NOC members and the monitor server.
The site links are
https://static.ntx.sunet.eu.seamlessaccess.org
https://static.se-east.sunet.eu.seamlessaccess.org
https://static.aws1.geant.eu.seamlessaccess.org
https://static.aws2.geant.eu.seamlessaccess.org
Use the General Troubleshooting section to fix alarms. If the thiss-js servers are unavailable, this raises alarms on the HAproxy servers; in that case, check the section for the thiss-js servers to troubleshoot them.
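When investigating a backend alarm, the HAproxy stats can also be read as CSV, which HAproxy serves when `;csv` is appended to the stats URL. The sketch below parses such an export and lists the servers that are not up. The function name is made up for this page, and treating every status that does not start with "UP" as a problem is an assumption (a server without health checks reports "no check", for instance).

```python
import csv
import io


def servers_not_up(stats_csv: str) -> list:
    """Return (proxy, server, status) for every backend server whose status is not UP."""
    # The CSV export starts with a header line prefixed by "# ".
    reader = csv.DictReader(io.StringIO(stats_csv.lstrip("# ")))
    problems = []
    for row in reader:
        if row["svname"] in ("FRONTEND", "BACKEND"):
            continue  # aggregate rows, not individual servers
        if not row["status"].startswith("UP"):
            problems.append((row["pxname"], row["svname"], row["status"]))
    return problems
```

An empty list means that every listed server reports UP.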
Upgrade
SeamlessAccess HAproxy Upgrade
Monitor
Server
| Name | Location | Env |
|---|---|---|
| monitor.ntx.sunet.eu.seamlessaccess.org | STO1v2, Safespring | Production |
Description
This is a monitor server which runs Nagios4 to monitor the health and operations of the virtual servers in Production, Beta and Staging. The GUI is here https://monitor.seamlessaccess.org/.
Information regarding access is given here https://wiki.sunet.se/display/sunetops/Monitoring.
Monitoring & Troubleshooting
- Run service nagios4 status to check the status.
- Run service nagios4 restart to restart the application.
- Check the logs in /var/log/nagios4/nagios.log
- Many of the configurations in /etc/nagios4/ are controlled by puppet manifests.
Upgrade
No proper guide is available. Nagios is usually upgraded, if a newer version is available, when we upgrade the OS of the server.
Demo Application
Server
| Name | Location | Env |
|---|---|---|
| sp-test.seamlessaccess.org | STO1v2, Safespring | Mixed |
Description
This server runs the Demo SP (service provider) applications for both Production and Beta. They are exposed at https://demo.seamlessaccess.org/ and https://demo.beta.seamlessaccess.org respectively.
The application runs in a docker container.
Monitoring & Troubleshooting
No specific monitoring is done for this service.
The General Troubleshooting section can be used for troubleshooting.
Upgrade
The application is upgraded by setting the version parameter in thiss-ops/global/overlay/etc/puppet/cosmos-rules.yaml or in thiss-ops/global/overlay/etc/puppet/modules/thiss/manifests/demo_sp.pp.
General Troubleshooting
Almost all services run in docker containers. They are added as systemd units. The names start with sunet-*.
- journalctl -fu <service name of the system unit>
- Check /var/log/syslog for older logs
- docker logs -f <docker container name>
- service <service name of the system unit> restart
For deeper troubleshooting, knowledge of SUNET's puppet & cosmos structure is needed, as mentioned in the Prerequisites section above.
The puppet manifests that deploy and manage the internal components are found here https://github.com/TheIdentitySelector/thiss-ops/tree/master/global. Those who have write access to it are mentioned here https://wiki.sunet.se/pages/viewpage.action?pageId=83493119
Use of SUNET INFRA cert
SUNET has its own CA, http://ca.sunet.se/infra/, for internal use.
We use certificates issued by it on the servers below.
| Servers | Purpose |
|---|---|
| HAproxy Load balancers | For authentication with Fastly |
| Aggregators & Publishers | For authentication with thiss-mdq servers |
We monitor the expiry of these certificates in https://monitor.seamlessaccess.org/
Follow SeamlessAccess SUNET INFRA cert update to update them.
Use of Fleetlock
We use the SUNET Fleetlock service on our virtual machines to coordinate the upgrades/reboots that are triggered by cosmos runs, so that only a given number of machines perform them at the same time.
We have configured it so that cosmos runs on one Aggregator at a time, on one thiss-js and one thiss-mdq server per site at a time, and on two HAproxy servers at a time.
After a successful cosmos run on a server, the application runs specific health checks to verify that the service running on that server is healthy, and then lets other servers in the same group run cosmos and perform their health checks.
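The coordination described above behaves like a lock with a fixed number of slots per group. The following in-memory model illustrates the idea; it is not the Fleetlock implementation. The class, the host names and the rule that a slot stays held until the health check passes are assumptions made for illustration.

```python
class GroupLock:
    """In-memory model of a lock with a fixed number of slots for one group."""

    def __init__(self, slots: int):
        self.slots = slots
        self.holders = set()

    def acquire(self, host: str) -> bool:
        """Give the host a slot if one is free (or if it already holds one)."""
        if host in self.holders:
            return True
        if len(self.holders) >= self.slots:
            return False  # the group is full; the host must wait
        self.holders.add(host)
        return True

    def release(self, host: str, healthy: bool) -> bool:
        """Free the slot only when the post-run health check passed (assumed rule)."""
        if healthy:
            self.holders.discard(host)
        return healthy
```

With `GroupLock(slots=2)` for the HAproxy servers, a third server is refused until one of the two current holders has finished its cosmos run and passed its health check.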
The relevant configurations can be found in https://github.com/TheIdentitySelector/thiss-ops/tree/master/global.
Read about Fleetlock here https://wiki.sunet.se/pages/viewpage.action?pageId=147522142.
Firewall Restrictions
Staging Metadata Service
For staging, SUNET only hosts the MDQ service, which is https://md-staging.thiss.io. It is served by SUNET load balancers.
The Discovery service for Staging, which is https://staging.build.thiss.io, is not managed by SUNET.
Access to Internal Components
...