Mobility Field Day 2

The excitement for MFD2 has been building for a while now, and I can tell you we are in store for a great couple of days.  I have been looking forward to this event since I accepted the invitation to be a delegate a couple of months ago.  After coming together for dinner tonight for the first time as a group, I am even more excited for all that is to come.

Here is the lineup:

Mist Systems – July 25, 2017 at 10:00 – 12:00 PST

I have had a chance to play around with one of Mist’s APs over the past couple of weeks and I found it to be very intuitive and easy to set up.  It took no time at all to get an SSID on the air and clients connected.  The data available on the dashboard is very useful.  I am looking forward to getting a deeper dive into the BLE features of the device.

Mojo Networks – July 25, 2017 at 2:00 – 4:00 PST

Mojo Networks, formerly AirTight, has gone through several changes over the last couple of years.  They have changed their direction and I am looking forward to hearing more about their plans going forward.  They had an excellent presentation at the Wireless LAN Professionals Conference (WLPC) this past February.

Cape Networks – July 26, 2017 at 9:00 – 10:00 PST

Cape is a company that I am not very familiar with yet, and I am very interested to see what they have lined up.  I did see their presentation at WLPC but haven’t had a chance to work with their gear yet.

Nyansa – July 26, 2017 at 10:30 – 12:30 PST

Another company that I am really looking forward to hearing more from is Nyansa.  From what I have seen, they are able to take wireless client data and present it to an engineer in an easy-to-read format, giving a better understanding of what is actually going on in the wireless network.  You can find their presentation from WLPC here.

Netscout – July 26, 2017 at 2:00 – 4:00 PST

The AirCheck G2 is our go-to tool in the field when a wireless problem is reported at work.  I am excited to hear what they have in the pipeline for the future.  I am also pumped that we get to see our good friend Kendall Hershey.  A couple of us had the pleasure of meeting and hanging out with her at Cisco Live 2016.  She and I both missed this year’s Cisco Live so it will be cool to see her again.

Being able to watch and listen to presenters at previous Wireless/Mobility Field Days and interact with the delegates was something I always looked forward to.  With this being my first MFD, I would be lying if I said I wasn’t a bit nervous, but I am with a great group of professionals so I am sure it will go great!  I am very excited for the next two days as they will be packed from the beginning of the day until the end with all kinds of great information and maybe some surprises.

Please feel free to follow along on the Mobility Field Day page here.

Please also feel free to interact with me or any of the other delegates listed in the above link.  Interacting with the delegates was almost as fun as listening to the presenters in my past experiences.  If you have any questions you would like to see asked of the presenters, feel free to reach out to any of us delegates via Twitter.  Make sure to also use the hashtag #MFD2.

In the weeks to come I will be releasing blog posts about the presentations so make sure to check those out when they are released.

 

 

#WLPC 2017 PHX, #SingleChannelAdventures, and Looking Back

It has been a couple of weeks since I returned from WLPC in Phoenix.  It was a great trip down to the southwest which included catching up with some good friends, listening to great presentations, learning a lot, and also presenting on a topic near and dear to my heart.  I was able to put a lot more names to faces and have some great conversations.

As many of you know I presented on how well single channel architecture works for us.  You can find the video below:

I have received mostly positive feedback on the presentation.  It was a great opportunity for me to speak in front of a large crowd and give examples of how we have Meru deployed across our school district.  As well as it went, I feel like I need to explain a few things.  I have come to the realization that a Ten Talk may have been the wrong platform to discuss the topic at hand.  I understand that most people wanted more information about SCA rather than “it just works for us.”  Ten minutes was simply not enough time to discuss all of this.

First, I understand that being a large network doesn’t equal wired/wireless networking done right.  I didn’t mean to convey that in the presentation, but I have received feedback indicating that it came across that way.  Although we are very proud of how large and successful our network is, it doesn’t mean that large = successful.  A lot of time and effort goes into making a wireless network work well for 60,000 users on average every day.  We all know that architecture doesn’t matter if you don’t define, design, implement, and validate.

Second, I gave some confusing information.  I realized shortly after the presentation that the stat showing that we “average ~12 connections per AP county-wide” was very vague.  Some of our access points have upwards of 100 clients on them at once for an extended period of time.  Some of our access points have 10 clients total (maybe fewer) in an entire day.  It doesn’t matter whether there are 30+ clients on an AP during every class or another AP that sits unused for the majority of the day.  If an AP was placed in a location, it was placed there with the intent that wifi could be needed at any point of an instructional day, planned or unplanned.  Whether someone thinks there are too many APs is irrelevant.  Unless you have actually visited our schools and used the network, you won’t know.  The proof is in the pudding, so to speak.
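As a quick illustration of why a single average is so vague, here is a small Python sketch.  The per-AP client counts are made up for the example (they are not our real numbers) and the `summarize` helper is hypothetical; the only point is that one busy AP and a handful of quiet ones can still average out to 12.

```python
from statistics import mean, median

# Made-up peak client counts for ten APs (illustrative only, not measured data).
peak_clients = [100, 8, 5, 3, 2, 1, 1, 0, 0, 0]


def summarize(counts):
    """Return the mean alongside figures that show how the load is spread."""
    avg = mean(counts)
    return {
        "mean": avg,
        "median": median(counts),
        "max": max(counts),
        "aps_at_or_below_mean": sum(1 for c in counts if c <= avg),
    }


summary = summarize(peak_clients)
print(summary)
```

In this made-up set the mean is 12, yet nine of the ten APs sit at or below it and one carries 100 clients, which is why the average alone says very little about any individual AP.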

I know that most of us consider our wireless networks mission critical.  The vast majority of our schools don’t have desktop computers other than in administrative areas.  Many of our schools don’t have computer labs, but employ several carts of mobile devices.  I know that most of us want NUMBERS to back up the user experience, but most of the time I don’t have time to get this type of information.  If reports of “wireless problems” are low or non-existent and teachers are able to complete their instruction using mobile devices and be successful, our mission is accomplished.  I know a lot of you won’t be happy until you see NUMBERS and that is fine.  I am going to try really hard to get some of those and put them out there.  To be perfectly honest, I don’t know if some of you will believe it then, and that is fine too.

I understand many of the things that have been said as far as physics, airtime consumption, high density, etc.  I don’t necessarily disagree with some of the opinions.  You may be absolutely right!  The thing that is somewhat discouraging is that a lot of the opinions are based on no experience, or on experience from several years ago.  Don’t get me wrong, the first time I saw virtual port, even with my limited wireless knowledge at the time, I couldn’t believe it would work.  A lot has been improved upon since then with virtual cell and especially the equipment.  I’m not saying Meru will beat another vendor head-to-head every time, but I bet it will sometimes!  Does it really matter how many jigabits we can ram through the air or is it more important for a user to have an experience where they don’t even notice the wifi?  A user sits down, opens their laptop, completes a task, closes the laptop and moves on without even acknowledging the presence of wifi.  The wifi just works.  Please understand, I am not trying to discount numbers such as channel utilization, retries, available bandwidth, etc.  Those are all very important things to consider in a wireless environment.  We all look to those numbers first when trouble is initially reported.  Those are the numbers that give us a baseline to begin troubleshooting.  They are absolutely critical to a successful wireless deployment.

Obviously interest has been piqued since the presentation.  It has been fun, most of the time, discussing various aspects of single-channel vs. multi-channel environments.  I have heard a handful of different people give a handful of different explanations of what Meru’s special sauce is, and they are all different.  There isn’t a ton of information out there concerning the “magic” of Meru, but if you are truly interested please watch any video by Dr. Bharghavan, founder of Meru Networks.  Many of these videos are a few years old, but still offer great information.  To be honest, I need to re-watch most of them as I sometimes get tangled up in “it just works.”  Below are a handful of videos.

If you want to catch the videos later but still want to read the conclusion, please scroll down.

Meru Networks Wireless Virtualization Architecture – Part 1

Meru Networks Wireless Virtualization Architecture – Part 2

Meru Networks Wireless Virtualization Architecture – Part 3

Contention Management Schemes: Part 1 – Single / Multiple AP

Contention Management Schemes: Part 2 – Multiple APs

Maximizing Air Traffic – Part 1: Maximize Channel Reuse

Maximizing Air Traffic – Part 2: Simultaneous Transmissions

Leveraging Single Channel Architecture for Multiple Channels

 

A Little Dated But Still Good

Very High Density Wireless LAN Demonstration for BYOD: #1

Very High Density Wireless LAN Demonstration for BYOD: #2

Conclusion

For those of us who went to WLPC 2017, we heard more than one person mention that wireless networking can be done in more than one way.  We also heard that less-than-ideal practices may be employed against our better wireless judgement due to other factors such as politics, aesthetics, etc.  Sometimes I think we need to remember that just because someone does something differently doesn’t mean that it is wrong.  We also need to remember that just because we don’t like a technology it doesn’t mean it doesn’t fit someone’s need.  I need to remind myself of this from time to time.  We deploy wireless networks in schools, mines, warehouses, large refrigerators, outdoors; you name it, we put wifi in all kinds of places.  Our ultimate goal should be to use the knowledge we have to deploy a wireless network that gives a reliable experience to the greatest number of users.

Frozen Fi – The Big TX Freeze


We all find bugs in software from time to time and most of the time they are fixed in relatively short order…..maybe….

It is always an unsettling feeling when those bugs resurface in later versions of code after being “fixed.”

We recently received reports from a school that the wifi was “not working.”  Early on in troubleshooting I could see that stations were associated to APs in the reported troublesome area.  All the normal quick checks looked OK.  There wasn’t high channel utilization, retries were low, the APs were not overwhelmed with clients, etc.  I touched base with the person who reported the trouble and they informed me that the issue affected all devices and that it seemed to be in one specific area: the library.  I again checked the APs via the web UI and at a quick glance everything still looked OK.  I logged into the controller CLI and issued the station database command to see what the clients were doing specifically on the library access points.  And then it got cold……..

Station database showing the dreaded TX freeze

As you can see in the graphic above, the TX packets are few or none at all.  This problem occurred a couple of years ago in later versions of System Director 6 code and was fixed with a newer release, so I had seen the problem before.  I took down all of the necessary information so that I could put a ticket in with TAC, then rebooted the two APs in the library.  Both of the APs affected were AP832s and both were experiencing the same problem.  All other APs (AP1020s) were working fine.  After the reboot the APs returned to working order.
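The symptom in that output, stations that keep receiving while their TX counters barely move, can be checked for mechanically.  The following is a minimal Python sketch, not anything the controller provides: the counter snapshots, the MAC addresses, the thresholds, and the `stalled_tx` helper are all hypothetical, and pulling real counters out of the station database output is left out entirely.

```python
def stalled_tx(before, after, min_rx_delta=100, max_tx_delta=0):
    """Return MACs whose RX counter advanced while TX stayed (almost) flat.

    `before` and `after` map a client MAC to a (rx_packets, tx_packets)
    tuple taken at two points in time. The structure and the default
    thresholds are assumptions made for illustration.
    """
    frozen = []
    for mac, (rx_after, tx_after) in after.items():
        if mac not in before:
            continue  # client joined between snapshots; nothing to compare
        rx_before, tx_before = before[mac]
        rx_delta = rx_after - rx_before
        tx_delta = tx_after - tx_before
        if rx_delta >= min_rx_delta and tx_delta <= max_tx_delta:
            frozen.append(mac)
    return sorted(frozen)


# Made-up snapshots taken some time apart (not real data).
snapshot_1 = {
    "aa:bb:cc:00:00:01": (5000, 4200),
    "aa:bb:cc:00:00:02": (800, 12),
}
snapshot_2 = {
    "aa:bb:cc:00:00:01": (6500, 5600),
    "aa:bb:cc:00:00:02": (2300, 12),
    "aa:bb:cc:00:00:03": (40, 0),
}

print(stalled_tx(snapshot_1, snapshot_2))
```

Comparing two snapshots rather than reading one avoids flagging a client that simply has not been associated long enough to build up any counters.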

I began putting my documentation together to submit a ticket when we received word that another school was having wireless issues.  This school was having trouble in its library as well.  As in most of our schools, all high density areas (for the most part) are serviced by AP832s.  I logged into the CLI of that school’s controller and found the same TX freeze occurring.  It became obvious to me at this point that the problem was likely affecting more than these two schools.  I checked a few other controllers and found most AP832s were in a frozen TX state.  I started to think about what the common denominators could be.  All the APs affected were AP832s.  All the controllers were running the same code, System Director 8.1-2-0.

And there it was….

I glanced at the uptime of the AP832s on the controllers that I had up.  All of the AP832s had been up for 99 days.  I started going through other controllers and checking AP832s that had been up for 99 days and found them all to be in a TX frozen state as well.  The few I found that hadn’t reached 99 days were still operating normally.  Clearly there was a bug that reared its head on the 99th day of uptime.
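That check, same model and uptime at or past 99 days, is easy to express in code.  Here is a minimal Python sketch of the idea; the `AccessPoint` record, the AP names, and the `at_risk` helper are hypothetical, and the 99-day threshold is simply what I observed, not a documented value.

```python
from dataclasses import dataclass

FREEZE_UPTIME_DAYS = 99  # threshold observed in this post, not a documented value


@dataclass
class AccessPoint:
    name: str
    model: str
    uptime_days: int


def at_risk(aps, model="AP832", threshold_days=FREEZE_UPTIME_DAYS):
    """Return names of APs of the given model at or past the uptime threshold."""
    return [
        ap.name
        for ap in aps
        if ap.model == model and ap.uptime_days >= threshold_days
    ]


# Made-up inventory for illustration.
inventory = [
    AccessPoint("library-1", "AP832", 99),
    AccessPoint("library-2", "AP832", 102),
    AccessPoint("gym-1", "AP832", 87),
    AccessPoint("room-101", "AP1020", 99),
]

print(at_risk(inventory))
```

Filtering on model as well as uptime mirrors what I saw: an AP1020 with the same uptime kept working.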

So you are probably asking yourself why so many of the APs had reached the 99 day uptime mark at the same time.  I was wondering the same thing at first and then it hit me.  I had done a mass upgrade of controllers over the summer to System Director 8.1-2-0 which mostly put all of the controller/AP uptimes in sync.

We launched a preemptive strike and rebooted all AP832s except a group of three APs at one middle school which hadn’t reached 99 days yet.  We knew that if we rebooted all of the APs the issue would be resolved for another 98 days or until FortiRu provided us a fix.  All of the AP832s returned to working order.
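Since a reboot only resets the clock, it helps to know when each AP will hit the mark again.  A minimal Python sketch of that date arithmetic follows; the boot date is a made-up example, the helper names are hypothetical, and it assumes the 99-day behavior I observed holds for every AP832.

```python
from datetime import date, timedelta

FREEZE_UPTIME_DAYS = 99  # observed here; assumed to hold for every AP832


def freeze_date(last_boot):
    """Date on which an AP booted on `last_boot` reaches 99 days of uptime."""
    return last_boot + timedelta(days=FREEZE_UPTIME_DAYS)


def days_until_freeze(last_boot, today):
    """Days left before the 99-day mark (negative if it has already passed)."""
    return (freeze_date(last_boot) - today).days


boot = date(2024, 1, 1)  # hypothetical reboot date
print(freeze_date(boot))
print(days_until_freeze(boot, date(2024, 4, 1)))
```

Run against each AP’s last reboot date, this gives a simple list of which APs need a reboot scheduled before they freeze.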

At that point I assembled all of my documentation and submitted a ticket to TAC.  We did some initial troubleshooting but they needed to have some APs in the frozen state to get the information they needed.  We put the ticket on hold until our test group of three APs reached 99 days uptime.  That occurred over this past weekend, November 20.  When I returned to the office I checked the APs which were at an uptime of 102 days.  All three were in a TX frozen state.

AP832s with uptime of 99 plus in a frozen TX state

As of right now, the issue is with TAC.  It looks like we are the first to report the issue.  They have all the information that they requested (logs, diags, etc.), so I should be hearing back from them soon.  In the meantime, a quick reboot is just the ticket to thaw out the TX.

Here are a few commands I used to gather AP/radio data from within the AP CLI:

stadb display assigned

stadb display assigned -v <mac-addr>

stadb display rxq_info <client mac-address>

stadb display txq_info <client mac-address>

sys exec /wl -i radio0 msglevel err

sys exec /wl -i radio1 msglevel err

radio txqinfo radio0 (radio zero)

radio txqinfo radio1

radio txq radio0

radio txq radio1

radio display

radio show radio0 (radio zero) – radio specific parameters

radio show radio1 – radio specific parameters

radio stats radio0 (radio zero) – radio specific stats

radio stats radio1 – radio specific stats

dev cmd radio0 reset – resets radio without rebooting the AP

dev cmd radio1 reset – resets radio without rebooting the AP

sys exec cat /proc/meminfo