JUNOS: IS-IS STUDY NOTES, PART 2 – FOR JUNIPER’S JNCIS-SP and JNCIS-ENT EXAMS

This is part two of my Beginner’s Guide to IS-IS, for people studying towards their Juniper JNCIS-SP and JNCIS-ENT certifications – and for people who just enjoy learning awesome new things! If you’ve not read part one yet, stop! Click here and read it, and then come back. Don’t worry: I’ll wait.

Done? Marvellous! In Part 1 you learned the basics of IS-IS. You learned what a level is, what an LSP is, and you also learned how to do a very basic configuration. Let’s take our knowledge, and our love, to the next level.

 

WHAT DOES AN LSP LOOK LIKE? AND WHAT IS A TLV?

You’ll remember from Part 1 that LSP stands for Link-State PDU. Every router generates an LSP which gets flooded throughout the network, which lets all other routers know what area the router is in (we’ll come back to that), what routers it connects to, what IPs it knows about, and what the “metric” (what OSPF would call the cost) is to get to them.

Let’s look at an IS-IS LSP, and learn why they’re a bit more “chillaxed to the max” than OSPF LSAs.

You see, OSPF LSAs have a very strict format. They always look the same, and if a router ever receives anything different then the router will immediately explode into flames (I expect), or at the very least it will reject the LSA, and possibly tear down the adjacency. This initially made it difficult to extend OSPF with other functionality, although “opaque LSAs” were introduced later on that fix this.

By contrast, IS-IS LSPs are actually quite flexible, in that the information in an LSP, or in any of the other messages that you’ll soon learn about, doesn’t have to be in any set order – and doesn’t even need to be understood by the receiving router! Instead, there’s a basic and short header that is always the same, followed by a variety of pieces of information that can be in any order. If a router doesn’t understand one of the pieces, it just ignores it and carries on – but the router will still re-advertise it to other neighbors, in case they understand it. Wow!

These chunks of data use a format called TLV, which stands for Type Length Value. It’s a method that many protocols use to advertise information, where each piece of information is split up into three sections: the Type of information being advertised (this is just a number – each piece of information has a number associated with it); the Length of the data that follows, in bytes; and then the actual data itself (the Value).
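To make the Type/Length/Value idea concrete, here is a minimal Python sketch of a TLV walker. It assumes the IS-IS layout of a one-byte type followed by a one-byte length that counts only the value bytes. The function names, the tiny `KNOWN` lookup table, and the sample bytes are all made up for illustration; type 99 stands in for "a type this sketch has never heard of".

```python
def parse_tlvs(data: bytes):
    """Split a byte string into (type, length, value) tuples."""
    tlvs = []
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated TLV header")
        tlv_type = data[offset]
        length = data[offset + 1]          # counts the value bytes only
        value = data[offset + 2 : offset + 2 + length]
        if len(value) != length:
            raise ValueError("truncated TLV value")
        tlvs.append((tlv_type, length, value))
        offset += 2 + length               # jump straight to the next TLV
    return tlvs


# Hypothetical lookup table: just two well-known types for the example.
KNOWN = {129: "Protocols Supported", 137: "Hostname"}


def describe(tlvs):
    """Name the TLVs we know; anything else is skipped over, not rejected."""
    return [KNOWN.get(tlv_type, "unknown, ignored") for tlv_type, _, _ in tlvs]


# Protocols Supported (0xCC is the IPv4 NLPID), an unknown type 99, Hostname "R2".
sample = bytes([129, 1, 0xCC]) + bytes([99, 2, 0xAA, 0xBB]) + bytes([137, 2]) + b"R2"
print(describe(parse_tlvs(sample)))
```

Because every TLV announces its own length, the walker can hop over a type it doesn't recognise without losing its place, which is what lets a router ignore a TLV and still pass it along intact.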

For example, here’s a packet capture of an IS-IS LSP taken by the mighty Jeremy Stretch at PacketLife. He’s made loads of packet captures that you can use in your studies.

Let’s break down what’s going on here. After the Ethernet headers (which are made up of the “IEEE 802.3 Ethernet” header and the “Logical-Link Control” header, because it uses old-style Ethernet), you’ll see two new headers.

The first one isn’t expanded out, but it says “ISO 10589 ISIS InTRA Domain Routeing Information Exchange Protocol“. This is a fixed header, but it’s honestly not very interesting. You can go and see it on Jeremy’s page if you like, but basically there’s nine fields, and the only interesting ones are the IS-IS version, and the type of message being sent. In this case it’s a Level 1 LSP.

Under that you see the thing we’re interested in, the PDU itself – the “ISO 10589 ISIS Link State Protocol Data Unit“.

The LSP itself has a short fixed header. Notice that it starts with the PDU Length, followed by a Remaining lifetime, and then the LSP-ID. As it happens, this router has a System ID of 2222.2222.2222. Jeremy will have configured this on the loopback of this router, as part of the ISO address. Notice that this ISO address is used to generate the name, in other words the LSP-ID, of the LSP itself. This is like how OSPF uses the Router ID to name some of its LSAs.

The very last number in an LSP-ID is usually 00. If it’s anything else then it means the LSP was fragmented, because it was too big to fit inside one PDU.
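If it helps to see the pieces of an LSP-ID pulled apart, here is a small Python sketch. It assumes the ID is written the way packet analysers and router CLIs commonly display it, as `xxxx.xxxx.xxxx.pp-ff`: the System ID, then a one-byte pseudonode number, then a one-byte fragment number after the dash. The function name and the dictionary keys are my own invention.

```python
def split_lsp_id(lsp_id: str):
    """Split 'xxxx.xxxx.xxxx.pp-ff' into system ID, pseudonode ID and fragment."""
    node, _, fragment = lsp_id.rpartition("-")      # fragment is after the dash
    system_id, _, pseudonode = node.rpartition(".")  # pseudonode is the last dotted byte
    return {
        "system_id": system_id,
        "pseudonode_id": int(pseudonode, 16),
        "fragment": int(fragment, 16),
    }


print(split_lsp_id("2222.2222.2222.00-00"))
print(split_lsp_id("2222.2222.2222.00-01"))  # a second fragment of the same LSP
```

A fragment value other than zero is the sign that the router's information didn't fit into a single PDU.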

Next, the Sequence Number goes up every time something changes, for example if a prefix is added or removed on the router. After that there’s a Checksum (and a Checksum Status), and a Type block. This isn’t expanded out, but inside it you can see the Attached bit, which I’ll introduce you to in Part 3. Here it’s set to 0, which means it isn’t set.

After that, there’s a whole string of interesting info. This is where the TLVs begin.

Let’s focus in on one of them: the “Hostname” TLV. It has three fields: the Type (137, which means Hostname), the Length, and then the value itself. That’s a TLV! In this screenshot, this particular TLV is located in the middle of all the TLVs, but it doesn’t have to be. It could be anywhere in the LSP. You can see that this LSP comes from a router that is simply called R2.

We also see a TLV called “IS Reachability“. This lists all the routers that R2 is connected to. In this screenshot we see it’s connected to one neighbor, which is “3333.3333.3333.02”. You can probably guess from Jeremy’s naming convention that this is going to be R3! As it happens, that “02” at the end means that R2 is connected to the Designated Router (or the DIS, Designated Intermediate System, to be precise). We’ll talk about that in a moment.

You also see a TLV which contains all the IP ranges on this router. In this example, the “IP Internal reachability” TLV says there’s two IPs on this router: 10.0.10.0/32, and 192.168.10.0/24.

There’s even a “Protocols Supported” TLV. This one lists IPv4, but you could easily add IPv6 in there too.

Damn, that was pretty easy to read when you compare it to looking at an OSPF LSA, right? No obscure terminology to remember, no weird quirks, no guff about transit/stub/virtual/point-to-point for you to memorise. The IS-IS LSP just lays it out in a very readable format.

 

HELLO, CSNP AND PSNP: THE OTHER THREE MESSAGE TYPES

There’s four major types of message that IS-IS routers will send to each other. This is as opposed to the twenty million different messages and LSAs that OSPF seems to have.

(To be more precise, IS-IS technically has more message types than OSPF – but that’s only because Level 1 and Level 2 often have their own unique message types for the same thing, for example a Level 1 LSP and a Level 2 LSP, or a Level 1 Hello and a Level 2 Hello. In practice the two messages are identical, apart from the fact that one is marked as a Level 1 message, and one is marked as a Level 2 message.)

You’ve learned about one of them already: the LSP itself. Now let’s look at the three others.

One thing you’ll notice about all of these messages is that they’re not destined to an IP address! Instead, there’s a MAC address, and then we go straight into the IS-IS PDU. In other words, IS-IS runs directly on top of layer 2, rather than inside IP packets like OSPF does. Some people argue that this actually makes IS-IS more secure than OSPF, because you can’t remotely flood another router with IS-IS messages. You have to be directly connected to another device in order to send it IS-IS messages.

It also means you don’t need to allow IS-IS packets through on any control plane firewall filters, because those will operate at layer 3. If you don’t want an interface to talk IS-IS, all you have to do is…. nothing! Just don’t turn “family iso” on, and the router will reject all IS-IS messages on that interface.

 

HELLO MESSAGES

Just like OSPF, IS-IS sends out Hello messages. These messages are used to initially discover neighbors, and then to keep the adjacency alive once it’s up.

IS-IS Hello messages are sent out to a multicast MAC address every 9 seconds by default in Junos. Other vendors may do it differently. Fun fact: these messages are also called IIHs, which is short for IS-to-IS Hellos.

In an IIH header you’ll see the System ID of the sender, and also the Holding Time (what OSPF calls the dead interval), which is how long to wait before declaring the neighborship dead. Can you see those elements in this screenshot? They’re both in the header of the Hello message.

There’s three kinds of Hello: Level 1 LAN, Level 2 LAN, and point-to-point. This is what I mean about there technically being more messages than OSPF, but only because there’s different kinds of Level 1 and Level 2.

Interestingly, if a link is point-to-point, then it’s possible for the one Hello message to represent both Level 1 and Level 2! They use a special type of “Level 3” to mean both levels. I don’t know why this is exclusive to point-to-point links, but there we are.

This particular Hello is Level 2 LAN. We see “Level 2 only” in the Circuit type field, but how can I tell that this is a LAN hello? There’s a key giveaway: this Hello message also tells us what it believes is the Designated IS on the LAN, and what the router’s DIS priority is. You wouldn’t get this on a point-to-point link. We’ll talk about the DIS in a moment, and how it compares to the OSPF DR.

The Hello message contains some sweet TLVs, like the “Protocols Supported” TLV which shows that the router supports IP. You also see the Area Address TLV, the IP Interface Address TLV, and a whole lot of padding.

The padding is interesting. Just like OSPF, the MTU has to match in IS-IS. However, rather than advertising an explicit MTU, IS-IS pads the Hello message up to the size of the MTU. If the receiving router has a smaller MTU, it won’t be able to accept the Hello message. IS-IS requires an MTU of 1492 at a minimum, and the Hello is padded accordingly. There’s no need to pad any other messages, because if the router can’t accept the Hello then the adjacency won’t form in the first place.

Not many things need to match for the adjacency to come up. If it’s a Level 1 Hello then both routers need to have a matching area. The MTU needs to match. Interestingly, the Holding Timer doesn’t need to match; if they’re different then each router just honours the other router’s request. This is a great example of how IS-IS brings more flexibility. Some vendors give you configuration options which will force some of these things to match, but in practice you probably don’t need to do that.
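Here is a rough Python sketch of those matching rules, exactly as described above and nothing more. The `Hello` class, the `accepts` function and all the field names are hypothetical, and a real implementation checks other things too (authentication, for one). Treat it as a memory aid rather than as how any router actually does it.

```python
from dataclasses import dataclass


@dataclass
class Hello:
    level: int         # 1 or 2
    areas: set         # area addresses carried in the Area Address TLV
    padded_size: int   # size of the padded Hello, in bytes
    holding_time: int  # seconds the sender asks us to wait for the next Hello


def accepts(local_areas: set, local_mtu: int, hello: Hello):
    """Return (ok, reason) for a received Hello, using only the rules in the text."""
    if hello.padded_size > local_mtu:
        return False, "Hello is bigger than the local MTU"
    if hello.level == 1 and not (local_areas & hello.areas):
        return False, "Level 1 needs a shared area"
    # The holding time is never compared: we simply honour what the sender asked for.
    return True, f"neighbor expires after {hello.holding_time}s without a Hello"


print(accepts({"49.0001"}, 1492, Hello(1, {"49.0001"}, 1492, 27)))
print(accepts({"49.0001"}, 1492, Hello(1, {"49.0002"}, 1492, 27)))
print(accepts({"49.0001"}, 1492, Hello(2, {"49.0002"}, 1492, 30)))
```

Notice that the third call succeeds even though the areas differ and the holding time isn't 27: it's a Level 2 Hello, and the timer is the sender's business.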

 

COMPLETE SEQUENCE NUMBER PDU (CSNP)

When two routers – sorry, Intermediate Systems! – become neighbors – sorry, become adjacent! – they swap their knowledge of the network. They COULD do this by flooding all the LSPs they know about to each other, but that would be pretty wasteful if it turned out that a router knew most, or even all, of the LSPs already.

As such, they don’t send the full contents of the IS-IS database. Instead, they just send a summary, in a message called a CSNP, or Complete Sequence Number PDU.

As you can see on the right, a CSNP contains a list of all the LSP IDs and Sequence Numbers that a router knows about. You saw both of these numbers in the LSP packet capture earlier. You can see from the Source-ID that this CSNP message is from Router 4, and it contains three entries. Notice here that the router 4444.4444.4444 appears twice in the CSNP. That’s because there’s one LSP for the router itself, and one for the router acting as the Designated Intermediate System. Again, we’ll come back to that soon.

You can see here that a CSNP is literally just a list of LSP IDs, and doesn’t contain any prefix information itself. Think of a CSNP like an index, or the contents page in a book. If you don’t know what a book is: a book is a bit like a movie, except that some trees had to die for it to be made. And if you don’t know what a tree is, think of a tree as like a kind of haunted granddad that sucks moisture from the ground.

 

PARTIAL SEQUENCE NUMBER PDU (PSNP)

Once the routers have swapped CSNPs, they can check to see if there are any LSPs missing from their own databases. They compare their own list of known LSPs against the CSNP received from their neighbor. If a router sees an LSP is missing, or if it sees that a neighbor has a newer version, it requests it with a PSNP – the P standing for Partial.

I didn’t do a screenshot of this, because I want to encourage you to go here and look at Jeremy’s P2P capture. It’s very simple though: it’s just a smaller version of the CSNP, with a list of LSP-IDs and LSP Sequence Numbers. The router can then reply by sending the full LSP that the LSP-ID represents.
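The CSNP-then-PSNP comparison boils down to a dictionary lookup. This Python sketch assumes each database is just a mapping of LSP-ID to sequence number, which is a simplification: real CSNP entries also carry a remaining lifetime and a checksum. The function name and the sample data are invented for the example.

```python
def lsps_to_request(local_db: dict, csnp_entries: dict):
    """Return the LSP-IDs a router would list in its PSNP."""
    wanted = []
    for lsp_id, neighbor_seq in csnp_entries.items():
        local_seq = local_db.get(lsp_id)
        # Ask for anything we have never seen, or that the neighbor has a newer copy of.
        if local_seq is None or neighbor_seq > local_seq:
            wanted.append(lsp_id)
    return sorted(wanted)


local_db = {
    "1111.1111.1111.00-00": 5,
    "4444.4444.4444.00-00": 3,
}
csnp_from_neighbor = {
    "1111.1111.1111.00-00": 5,   # same version, nothing to do
    "4444.4444.4444.00-00": 4,   # neighbor has a newer copy
    "4444.4444.4444.01-00": 1,   # we have never seen this one
}
print(lsps_to_request(local_db, csnp_from_neighbor))
```

Only the two LSPs that are stale or missing end up in the request, which is the whole point of swapping summaries first.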

Once again, all of that seems far less complicated than the OSPF equivalent.

 

SOME QUICK BULLET POINTS ABOUT ADJACENCIES

  • Cisco uses Hello/Hold timers of 10/30 seconds for routers, and 3/10 seconds for the DIS. Juniper uses 9/27 seconds for routers, and 3/9 for the DIS.
  • Unlike OSPF, IS-IS transitions to a full adjacency as soon as hellos are exchanged – NOT when the LSPs are exchanged. This means that if both routers see each other in their Hello messages, the adjacency is up.
  • There are six adjacency states: New > One-Way > Initialising > Up > Down (when there’s an area mismatch, hold time expired, authentication failure) > Reject (also for when authentication doesn’t match – it cycles between Down and Reject).
  • Although there’s six adjacency states, really you only need to know about Initialising, Up, and Down.
  • The CSNP is re-sent periodically on broadcast networks, by the DIS.

Wait, what’s a DIS? Good question!

 

DESIGNATED INTERMEDIATE SYSTEM (DIS)

On a broadcast network, OSPF elects a Designated Router. This achieves two things.

The first is to make the topology more simple to understand. If you’re not sure why this is, I’ll explain it in a moment when I introduce you to the concept of a pseudonode.

The second is to handle advertising LSAs. In a network of ten routers on a shared LAN segment, all OSPF devices will only become “fully adjacent” with the DR, not with each other. They then send their LSAs to the DR via a special DR-only multicast address, at which point the DR then re-advertises them to all the routers in the network. This is actually kind of inefficient in a way, because everything has to be sent twice, but it gets the job done.

(This is part of the reason that non-DR routers on the LAN don’t become fully adjacent with each other. If they did, they would end up exchanging LSAs with each other. By staying in the “2way” state, they never actually exchange LSAs with each other.)

This is also why a backup DR is elected. Non-DR routers send their LSAs to a multicast DR-only address, and if there’s only one device receiving them then it would cause a bit of chaos when a new DR is elected, because everything would need to be re-advertised to the new DR, and then re-re-advertised back to all the non-DR routers. As such, a backup DR is used. Both the DR and the BDR will become fully adjacent with all non-DR routers, and listen in on all the LSAs. If the DR disappears, the backup DR can immediately take over, because it already has all of the LSAs, and is already fully adjacent with all the other routers.

In IS-IS the DR is called the Designated Intermediate System – which makes sense, because Intermediate System is the ISO name for a router!

The mechanics are a bit different. In IS-IS, a router multicasts its LSPs directly to all the routers on the LAN. They don’t need to be re-advertised by the DIS. Way more efficient! Instead, the DIS just sends out a CSNP, a Complete Sequence Number PDU, every 10 seconds so that all routers can be sure that they do indeed have the latest and greatest information.

For that reason, there’s no need for a backup DIS. All routers take care of sending their own information to the LAN.

Just like in OSPF, each router has a priority which is used to elect the DIS. You can see this priority number in the IS-IS Hello packet capture above. The default in Junos is 64 – and just like OSPF, the numerically highest wins. Unlike OSPF, setting a priority to 0 doesn’t stop a router from becoming the DIS – it just means it’s not very likely. Also unlike OSPF, if a router comes online with a higher priority, it actually does take over the responsibility of being the DIS.

If two or more routers have the same priority, the highest MAC address is used as a tie-breaker. To be more specific, the SNPA (Sub-Network Point of Attachment) is the tie-breaker, which includes both MAC addresses and DLCIs on frame-relay circuits, in case you’re from the past.
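The election rule (highest priority wins, highest MAC address breaks a tie) fits in a few lines of Python. This is only a sketch of the comparison itself: the function name, the router names and the MAC addresses are made up, and the addresses come from a range reserved for documentation.

```python
def elect_dis(routers):
    """routers: list of (name, priority, mac). Returns the name of the DIS."""
    def key(router):
        _, priority, mac = router
        mac_value = int(mac.replace(":", ""), 16)  # compare MACs numerically
        return (priority, mac_value)               # priority first, MAC as tie-breaker
    return max(routers, key=key)[0]


lan = [
    ("R1", 64, "00:00:5e:00:53:01"),
    ("R2", 64, "00:00:5e:00:53:02"),
    ("R3", 0, "00:00:5e:00:53:ff"),   # priority 0 can still take part
]
print(elect_dis(lan))

# A router that joins later with a higher priority takes over as DIS.
lan.append(("R4", 100, "00:00:5e:00:53:00"))
print(elect_dis(lan))
```

Running the election again whenever the list changes mirrors the pre-emptive behaviour described above: there is no "sitting DIS" advantage to model.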

Changing an interface’s priority is nice and easy. Notice below that it’s done on a per-level basis.

set protocols isis interface ge-0/0/0.0 level 1 priority 100

 

PSEUDONODES

Both the DR and DIS have another important function, which is to make the topology more simple for SPF to deal with.

Check out this pic, taken from the old JNCIA guide, which you can and should buy and read!

Notice on the left that there’s four routers on a shared segment, each with an adjacency to each other. The adjacency itself doesn’t matter, that’s easy to deal with. Where things become a bit more tricky is when a router needs to run SPF, because the more topological connections there are between routers on a shared segment, the more complicated it becomes for SPF to do its thing.

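In fairness, in a network of just four routers it’s not such a big deal – but every router you add makes this noticeably more complicated for SPF to deal with, because the number of router-to-router connections grows with the square of the number of routers. A full mesh of n routers has n(n-1)/2 connections, so SPF has many more potential paths to process.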

To make this more simple, instead of each router seeming to connect directly to every other router on the shared segment, the DIS and the DR do something clever. Notice on the right that there’s some kind of invisible ghost router. All four real routers are showing as being connected only to this new pretend router that we made up. There’s a name for this haunted ghost router: it’s called the “pseudonode“, and it’s the role of the DR and DIS to generate it.

It’s weird that OSPF does exactly the same thing, and yet it has no official name for this concept. As such, you’ll often hear IS-IS folks using the term “pseudonode” to describe it in OSPF too.

You may remember earlier when you saw a screenshot of a PDU, that Router 4 appeared to be adjacent to “4444.4444.4444.01”. This was R4 saying that it is topologically connected to the pseudonode. All routers on the LAN will report a connection to this made-up router, and as a result the network map is made far more simple.

To quote from the JNCIA guide, “The pseudonode will advertise the neighbor relationships of all routers in its database update; the actual routers advertise a relationship with only the pseudonode”. This takes the strain out of the SPF calculation, because there are fewer adjacencies to compute.
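To put some numbers on that saving, here is a tiny Python comparison of how many connections SPF sees on one shared segment with and without a pseudonode. The function names are mine; the arithmetic is just counting pairs of routers versus counting routers.

```python
def full_mesh_links(n: int) -> int:
    """Every router connected to every other router: n * (n - 1) / 2 pairs."""
    return n * (n - 1) // 2


def pseudonode_links(n: int) -> int:
    """Every router connected only to the pseudonode: one link each."""
    return n


for routers in (4, 10, 50):
    print(routers, full_mesh_links(routers), pseudonode_links(routers))
```

With four routers the difference is six links against four, which is hardly worth the bother. With fifty it's 1225 against 50, and the ghost router starts to earn its keep.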

For this reason, you’ll see the router that acts as the DIS twice in the database – once for the actual physical router, and then once for the pseudonode that it generates.

One final thing: this pseudonode doesn’t add anything to the metric (the “cost”) between two routers, because all links that are “coming out” of the pseudonode have a cost of zero. Only the link from a real router into the pseudonode carries that router’s interface metric. And hey: that’s a great jumping off point to talk about IS-IS metrics!
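You can check the zero-cost claim with a toy shortest-path run. The Python sketch below uses a plain Dijkstra over a hand-built graph where three routers each reach a pseudonode (called `PN` here) at a metric of 10, and the pseudonode reaches each of them at a metric of 0. The graph, the names and the function are all invented for the example, and a real SPF implementation does a great deal more.

```python
import heapq


def shortest_costs(graph: dict, source: str) -> dict:
    """Dijkstra: lowest total metric from source to every reachable node."""
    costs = {source: 0}
    queue = [(0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > costs.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbor, metric in graph.get(node, {}).items():
            new_cost = cost + metric
            if new_cost < costs.get(neighbor, float("inf")):
                costs[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return costs


graph = {
    "R1": {"PN": 10},                  # real routers advertise only the pseudonode
    "R2": {"PN": 10},
    "R3": {"PN": 10},
    "PN": {"R1": 0, "R2": 0, "R3": 0},  # pseudonode advertises everyone at metric 0
}
print(shortest_costs(graph, "R1"))
```

R1 reaches R2 and R3 at a total of 10, the same figure it would get from a direct link with a metric of 10, so the extra hop through the pseudonode is free.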

 

METRICS & REFERENCE BANDWIDTH

Like OSPF, IS-IS uses the Shortest Path First algorithm to work out the best route to a prefix.

OSPF’s “cost” is based on the bandwidth of the link. Weirdly, by default IS-IS doesn’t care about that – it just gives every link a cost of 10, regardless of the speed! (The one exception is loopback interfaces, which get a metric of 0.) Remember that all of these protocols were made a long time ago, when network requirements were very different.

Don’t worry though: it’s very easy to change this.

First, you can increase or decrease the cost on a per-interface basis. You can even have different metrics for each level:

set protocols isis interface ge-0/0/0 level 1 metric 50
set protocols isis interface ge-0/0/0 level 2 metric 40

A much better idea though is to base the cost on the bandwidth of the link. Just set your “reference-bandwidth” of choice, and you’re “good” to “go”. Nowadays you have to do this in OSPF too, because by default all links of 100Mb and above have a cost of 1, so in practice there is really no difference in setting up either protocol in the year 2021. Here’s how you do it:

set protocols isis reference-bandwidth 100g

At the time of writing, 100g is the biggest number you can choose in Junos, though I’m sure this will change in the future, when 100000000g links are the norm.
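For a feel of the numbers a reference bandwidth produces, here is a small Python sketch. It assumes the metric is the reference bandwidth divided by the interface bandwidth, rounded down, with a floor of 1; the rounding and the floor are my assumptions for the example, so check your platform's documentation before relying on the exact values.

```python
GBPS = 1_000_000_000


def isis_metric(reference_bw_bps: int, interface_bw_bps: int) -> int:
    """Metric = reference bandwidth / link bandwidth, never lower than 1 (assumed)."""
    return max(1, reference_bw_bps // interface_bw_bps)


reference = 100 * GBPS  # same value as the 100g in the configuration line above
for name, speed in (("1G", 1 * GBPS), ("10G", 10 * GBPS), ("100G", 100 * GBPS), ("400G", 400 * GBPS)):
    print(name, isis_metric(reference, speed))
```

The 400G line shows why the reference value matters: anything faster than the reference ends up with the same metric as a link running at exactly the reference speed.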

Hey, here’s a fun fact that you’ll never need to know in the real world, but is interesting trivia: there’s actually four costs in an IS-IS TLV. As well as the “Default Metric”, IS-IS also allows you to calculate a path based on the Delay Metric, the Error Metric, and the Expense Metric, which is literally how much money the link costs!

Having said that, even though these numbers are included in the advertisement (as you can see in this picture), these metrics are always set to 0, and no vendor that I know of actually uses them. Still, it tickles me that IS-IS has four costs, including one actual monetary “cost”!

One final thing. See in that screenshot, the first three lines of the IPv4 prefix are to do with the “Default Metric”. The first line is 10, which is the metric itself. The second line is whether this is internal or external. If this had been redistributed into IS-IS, this bit would be set, so you can tell if a prefix is internal or external – though you’ll see in a moment that this isn’t always true.  Finally there’s something called the “Down” bit. Hold fire on that for a moment, because we’ll talk about that in Part 4, when we verify all of this on the CLI.

 

WIDE METRICS

By default, the maximum cost a link can have is 63, because the three original prefix and topology TLVs (IS reachability, IP Internal Reachability, and IP External Reachability) only gave 6 bits to the metric value – as you can see in the pic above of the different “metrics”.

Again, remember that all these protocols were invented a long time ago, when there were pretty much only four routers in the entire world. The idea that networks could have grown to the size and speed of today was the stuff of dreams. Luckily, the gods of the internet made some extensions with much bigger metric fields. These two TLVs are called the Extended IS Reachability TLV, which gives each link a 24-bit metric, and the Extended IP Reachability TLV, which gives each prefix a 32-bit metric. When you use these, it’s called using the Wide Metric, and is configured like this:

set protocols isis level 2 wide-metrics-only

Interestingly, Junos advertises both by default, for backwards compatibility purposes. But even though it advertises both out of the box, it only uses the small metrics by default. So remember to configure the wide metrics if you want to use them, which you definitely do!
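The difference between the two field sizes is easy to work out. This Python sketch computes the largest value each field can hold, and shows what happens to a metric that doesn't fit, assuming a value that is too large is simply capped at the field's maximum. The constant and function names are invented, and the capping behaviour is an assumption for illustration.

```python
NARROW_BITS = 6    # metric field in the original reachability TLVs
WIDE_BITS = 24     # link metric field in the extended TLV


def max_metric(bits: int) -> int:
    """Largest value an unsigned field of this many bits can hold."""
    return 2 ** bits - 1


def advertised(metric: int, bits: int) -> int:
    """Cap the metric at what the field can carry (assumed behaviour)."""
    return min(metric, max_metric(bits))


print(max_metric(NARROW_BITS), max_metric(WIDE_BITS))
print(advertised(100, NARROW_BITS), advertised(100, WIDE_BITS))
```

A metric of 100 survives untouched in the wide field but gets squashed in the narrow one, so once your metrics go above 63 the narrow TLVs can no longer tell a slow link from a slower one.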

A couple of other things to note when working out the best route:

  • Level 1 paths are preferred over Level 2 paths. In other words, if a router has learned a prefix via an L1 router, and it is also learning it via the backbone, it is definitely going to prefer the route within the non-backbone area, because this will be where the prefix originally came from.
  • Internal paths are preferred over external paths. In other words, if an IP is being learned via IS-IS directly, and if that same IP has been redistributed into IS-IS from another protocol, IS-IS prefers its own route.
  • When you turn on Wide Metrics, the extended TLVs don’t contain the “external” flag – in other words, IS-IS doesn’t distinguish between internal and external prefixes when you’re using wide metrics.

 

MESH GROUPS

This final bit is something you probably won’t use in the real world. Then again, most real-world networks look absolutely mad, and nothing like the best practices you read about in books, so actually there is a chance you might need it after all!

Imagine a full mesh of 10 routers. In other words, each router has nine IS-IS adjacencies.

When a network has a full mesh of routers, LSPs sent by one router will be received by all other routers – and then flooded to all other routers, which means that routers receive the same LSP multiple times. Not a big deal in a small network, but a real waste of resources in a big network.

You can make this more efficient by configuring something called a mesh group. It’s super simple: you just tell the router which interfaces are in the mesh group. Then, when an LSP comes in on an interface in the mesh group, it won’t be flooded out of any other interfaces in the mesh. Observe:

set protocols isis interface ge-0/0/0 mesh-group 1
set protocols isis interface ge-0/0/1 mesh-group 1
set protocols isis interface ge-0/0/2 mesh-group 1

The number is a 32 bit value, with only local significance. Boom! Easy. Then again though, in the year 2021 the bandwidth and CPU required to send and process LSPs really isn’t a big deal, so arguably this configuration introduces more complexity in exchange for very little benefit. Have a think about the benefits and trade-offs, and decide whether mesh groups are right for you.
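If you'd like to see the flooding rule as logic rather than words, here is a short Python sketch. It models only the behaviour described above: an LSP that arrives on a mesh-group interface isn't sent back out of the other interfaces in the same group. The function name and the extra `ge-0/0/3` interface (which is in no mesh group) are made up for the example.

```python
def flood_targets(incoming: str, interfaces: dict):
    """interfaces maps interface name -> mesh group number, or None for no group."""
    in_group = interfaces.get(incoming)
    targets = []
    for name, group in interfaces.items():
        if name == incoming:
            continue  # never send an LSP back where it came from
        if in_group is not None and group == in_group:
            continue  # same mesh group: that neighbor already gets it directly
        targets.append(name)
    return targets


interfaces = {
    "ge-0/0/0": 1,
    "ge-0/0/1": 1,
    "ge-0/0/2": 1,
    "ge-0/0/3": None,   # hypothetical interface outside the mesh
}
print(flood_targets("ge-0/0/0", interfaces))
print(flood_targets("ge-0/0/3", interfaces))
```

An LSP arriving from inside the mesh only leaves on the non-mesh interface, while an LSP arriving from outside is still flooded to every mesh member.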

 

THAT’S IT FOR NOW!

That’s it for part 2. How are you finding it? I hope these posts are useful to you. I’ve got to say, when I started studying IS-IS I thought I’d hate it, based purely on the “weird” addresses that I didn’t understand. But then, as I started studying it, I’ve ended up preferring it over OSPF any day of the week.

Go grab a cup of coffee, then come back for part 3, where we’ll go back to the concept of levels, and finally expand on the concept of the “Area ID”. Click here for part 3!

If this post was helpful to you, I’m always grateful if you share it on social media, Twitter, LinkedIn etc. It’s always fun when more people get to hear about my posts, and I’m grateful for you sharing it. If you’re on Twitter then follow me to find out when I make new posts. And hey – why not leave a comment saying hi! Who knows: perhaps we’ll even end up running off together and starting a new life in Hawaii.
