Upgrade Your Data Center With Confidence For Just $6,150

Many people we’re speaking with are going through data center upgrades – replacement of network equipment, upgrade of cabling and connection speeds, consolidation, expansion, you name it. This is something that can go very smoothly, or very wrong.

In our conversations with our customers, we’ve also discovered that one of the primary uses for indeni is validating that the new network equipment is correctly configured before it goes live. Think of it this way – you are setting up a new pair of routers, or firewalls, or load balancers, and you want to make sure that they are correctly configured before you have production traffic pass through them.

Who wants to start a maintenance window on Saturday at 2AM only to roll everything back after discovering some configuration mistake?

So, we’ve decided to come out with a special package for you: by spending as little as $6,150 you will get the ability to test the configuration of your devices before they go live. This package includes the full-blown indeni for 10 devices, for 6 months, with the ability to move indeni’s spotlight between devices. This means that every week you can point indeni at another set of devices, those that are due for deployment at the end of the week. During the week, indeni will keep an eye on those devices and tell you before the weekend if anything looks wrong. Then, you take those devices live and let indeni keep watching them until the end of Monday. If all goes well, you move to another set of devices on Tuesday. Simple, eh?

This package supports Check Point firewalls, Cisco routers and switches running IOS 12 and up, Fortinet Fortigates running 4 and up, Juniper SRXs, SSGs and ISGs and F5 BIG-IP 11.x.x.

How to know if Internet of Things will break your network

Yesterday I spent some time with a new customer of ours, a Fortune 500 communications company. They are essentially the network over which a variety of companies send their data. These companies can be in manufacturing, shipping, utilities, retail or anything else you can imagine. As we were talking, they mentioned a major project they are undergoing – a complete replacement of their networking gear from core to edge. Three years, tens of millions of dollars, requires CEO sign-off.

“Why are you doing that?” I asked. “Surely, that is a big and expensive project with many risks associated with it.” The answer was one I’ve heard several times over the past few months: “We’re rolling out IoT and digitizing many of our operations and there’s no way our current network infrastructure can deliver on the bandwidth and reliability requirements put on us by IoT.” While the bandwidth point is apparently a clear one, the reliability point only became clear to me a few weeks ago, when another company I spoke with explained it: connected “things” send their data in a way similar to UDP packets. That means that if you miss a packet, it’s lost. No option for retransmission. No way to recover it. It’s gone. The data will never come back.

That’s a very important point – it’s not only about bandwidth but more importantly about reliability. And we all know that while bandwidth can be obtained by adding more connections and bigger hardware, reliability is a whole other story. You need to up your game, run things with the accuracy they do at an Intel factory.

The litmus test for networks pre-IoT

So, if you’re on the networking or IT side of the house and the CEO is talking about digitizing the entire business, you should be worried if:

1. Services that are being digitized are critical to the company’s top and bottom lines.
2. You are relying on two-decade-old, SNMP-based network monitoring technology (like HP NNM, CA Spectrum, IBM Tivoli, SolarWinds Orion, etc.).
3. You have a shortage of networking talent or the talent you have isn’t experienced enough to run a network of this scale.

If you’ve answered YES to the above, it’s time to sit down and think long and hard about what you’re going to do. Chances are, your current network won’t deliver and you’ll be blamed for slowing down the company’s business progress.

ARP table is approaching its limits: Check Point Firewalls Configuration Alert Guide

This is a real life sample alert from the indeni configuration alert guide for Check Point Firewalls.

Description:

The device’s ARP cache is approaching its limit. Currently, there are 2046 entries in the ARP cache, while the limit is 2048 (99.9% is in use). The device is approaching a situation where some ARP entries will not be entered into the ARP table or some entries will be removed prematurely. Network connectivity will be affected.

To learn more about what is causing this, read about ARP Neighbour Overflow on blog.lachmann.org.

indeni will re-check this alert every 1 minute. If indeni determines the issue has been resolved, it will automatically be flagged as such.

Manual Remediation Steps:

Identify the cause of the large ARP cache. If it is due to a legitimate cause, such as a high number of hosts visible on the available networks, you should double the values of each of the following sysctl parameters:
net.ipv4.neigh.default.gc_thresh1
net.ipv4.neigh.default.gc_thresh2
net.ipv4.neigh.default.gc_thresh3
This can be done by updating /etc/sysctl.conf with the new values (the old ones are accessible by executing cat /proc/sys/net/ipv4/neigh/default/gc_thresh*) and running sysctl -p.

For more information please read SK43772.
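
As a rough illustration of the doubling step above, here is a minimal Python sketch that turns the current threshold values into new lines for /etc/sysctl.conf. The helper name `doubled_sysctl_lines` and the example values are hypothetical; on a real device you would first read the current values from /proc/sys/net/ipv4/neigh/default/gc_thresh* as described above.

```python
# Sketch only: builds /etc/sysctl.conf lines with doubled ARP gc thresholds.
# The helper name and the example values below are hypothetical.

GC_KEYS = ("gc_thresh1", "gc_thresh2", "gc_thresh3")


def doubled_sysctl_lines(current):
    """Return sysctl.conf lines that double each gc_thresh value in `current`."""
    lines = []
    for key in GC_KEYS:
        new_value = current[key] * 2
        lines.append(f"net.ipv4.neigh.default.{key} = {new_value}")
    return lines


# Illustrative values only, not read from a real device.
current = {"gc_thresh1": 512, "gc_thresh2": 1024, "gc_thresh3": 2048}
for line in doubled_sysctl_lines(current):
    print(line)
```

The printed lines would still need to be placed in /etc/sysctl.conf and applied with sysctl -p, as the remediation steps describe.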

How does this alert work?

indeni continuously monitors the values of the gc_thresholds mentioned above as well as the size of the ARP table (counting the number of entries resulting from “arp -an”). If the number of ARP entries is at least 80% of the total number of entries allowed, indeni alerts. indeni does NOT rely on the log message “Neighbor table overflow” as that tends to be too late – traffic is already lost at that point.
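
The 80% rule described above can be sketched in a few lines of Python. This is not indeni’s actual implementation; the function name `arp_usage_alert` is hypothetical, and the entry count and limit are assumed to have been collected already (for example by counting the output of “arp -an” and reading the relevant gc_thresh value).

```python
# Sketch only: the threshold check described above, not indeni's real code.

def arp_usage_alert(entry_count, limit, threshold=0.80):
    """Return (should_alert, usage) where usage is entry_count / limit."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    usage = entry_count / limit
    # Alert once usage reaches the threshold (80% by default).
    return usage >= threshold, usage


# Numbers taken from the sample alert: 2046 entries against a limit of 2048.
should_alert, usage = arp_usage_alert(2046, 2048)
print(should_alert, f"{usage:.1%}")
```

Checking the ratio ahead of time, rather than waiting for the “Neighbor table overflow” log message, is what leaves room to react before traffic is lost.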

The “How to avoid in the future” section of a Root Cause Analysis report – any use?

You had a major network outage (like Time Warner just did). Panic, stress, sweat, people trying all kinds of crazy things. In the end, the issue is resolved and the outage is behind you. Then comes the really fun part: doing a Root Cause Analysis (RCA).

There are a ton of templates for this, such as this one and this one. In each one, at the very end, is a section that details “how do we make sure this doesn’t happen again”. Sadly, though, in most cases, this section describes how processes will be changed, checklists will be made and extra peer reviews will be conducted. Frankly, our experience shows this rarely works.

Our goal is to change the way this is done. If someone were to spend the time to read every RCA ever written about a network outage and build a system that implements the recommendations detailed in that last section of the document, then issues would indeed be avoided.

We, indeni, are that someone. Of course, reading RCAs manually is a bit difficult, so we’ve devised more automatic ways of collecting this knowledge. With indeni, users have fewer RCAs to write. One of our larger customers actually told us we’ve reduced the number of RCAs they have per quarter by 93%!

So, if you’ve recently run into an annoying issue with Cisco routers and switches, or Check Point firewalls, or F5 load balancers, or anything else we support – give us a try. It only takes 45 minutes.

Announcing indeni 5.0: Trending capabilities, easier UI navigation, better performance, tons of additions

We’re excited to announce the release of version 5.0! After being used for quite a while by some of our customers, we’re ready to have the whole world enjoy what 5.0 brings.

IMPORTANT: 5.0 includes major changes to the underlying infrastructure of indeni’s engine. As a result, upgrades are done via a complete re-installation of the indeni OS and application. The upgrade maintains all of the existing monitoring definitions and alerts. Please contact indeni’s support to conduct the upgrade together.

New features:

  • The Analysis tab has been added, providing the ability to visually track critical metrics over time. These metrics are compared to the alerts issued (those orange bubbles at the bottom of the graph in the above screenshot).
  • Tabs have been re-organized in the web console to better fit our users’ task oriented activities (see more information below).
  • indeni Insight can now be configured from the web console.

New Infrastructure:

  • indeni 5 introduces the use of a new type of database in order to support the collection of data for the Analysis tab. This data includes: CPU, memory, disk space, connections and various NIC error statistics.
  • Following lessons learned from indeni 4, the SSH infrastructure was replaced in order to support better error handling when connecting to and communicating with the devices under monitoring.

New product versions supported:

  • IK-1675: Support CP R77.20
  • IK-951: Support FortiOS 5.0.1

NOTE: Customers who require support of a given product version prior to the main release can contact support@indeni.com and a development build will be provided.

New signatures:

  • IK-1677: Alert for Check Point device running with a Trial License
  • IK-1376: Inform when Dynamic routing protocol state changes
  • IK-1365: Notify when OSPF Topology Tree is rebuilt
  • IK-1316: Device profile – Alert if there are 2 different VTP domains in the same LAN

Changes to the Web Console

Following feedback from our customers, we have made some minor changes to the tab assignments in the Web Console in order to make it more intuitive:

The Operations Management tab incorporates the previous Home and Monitoring tabs.

  • Analysis holds the new trend graphs pane displayed above.
  • Home Dashboard is now available in the Network Health pane.
  • Signatures are now presented under Knowledge Management.

The Compliance Management tab has replaced the Device Config Management tab.

  • Configuration Checks replaces the Device Profiles pane.
  • Configuration Journal replaces Change Tracking.
  • Configuration Check Reports replaces Device Profile Compliance.

The Tools tab is a new one:

  • The Live Debug feature has been deprecated in favor of focusing on other functionality of indeni.
  • Live Configuration replaces the previous Actual Configuration pane (previously under Device Config Management).
  • Search replaces the Device Explorer pane.

The Reporting tab now holds the Inventory Report (previously under Device Config Management).

The “indeni Insight” tab was added to support enabling the new indeni Insight service (read more).

 

Bugs fixed and minor improvements:

  • Numerous performance and usability enhancements in the web console, including:
    • WC-1996: speed up multiple Alert Acknowledge
    • WC-1642: eliminate errors caused by device deletion
    • WC-1691: Ability to select and act on multiple alerts across multiple pages
  • WC-2004: Add more details to “ClusterXL member is in a critical state”
  • WC-1969: Reports | enhance behavior of item deletion
  • WC-1962: Resolve memory leak in FF30
  • WC-1950: Improve stability of “By Management” view
  • WC-1945: Network Health | smoothen zoom behavior
  • WC-1942: “Knowledge management” subtab loading speed improvement
  • WC-1935: Signatures – columns sizes correction
  • WC-1923: Usability – increase length of search field
  • WC-1900: Improve organization of options under “Resolve” button
  • WC-1878: Make Ignored Items show in the Device’s Alert configuration
  • WC-1817: Add an ability to select more than one interface for P1 MDS or CMA
  • WC-1404: Display default thresholds for signatures
  • WC-1403: Add an asterisk (*) symbol next to every signature with changed thresholds
  • IS-1019: Tools-Troubleshooting – add “cpstat os -f sensors”
  • IS-1002: Actual Configuration – ClusterXL Mode is “unknown” in some cases
  • IS-980: Decrease occurrences of alert flapping
  • IS-953: Actual Configuration performance improvement
  • IS-914: Improve the response of the “stop monitoring” feature
  • IS-887: Make identification of a device upgrade quicker
  • IS-876: Reduce time it takes to shut down indeni
  • IS-866: Move SSH communication layer to Apache SSHD from Ganymed
  • IS-826: Improve speed of loading actual configuration
  • IS-780: Change Tracking accuracy improvements
  • IS-742: Add Alert Severity to the Subject of alert e-mails
  • IS-618: Expose API to fetch measurements + history
  • IK-1690: Route overlap identified – don’t alert when next-hop is the same
  • IK-1689: Improve accuracy of Check Point ClusterXL sync-related alerts
  • IK-1688: SA#24915 alerts (e.g. packet errors) should contain the total number of packets that we compare against
  • IK-1674: “A NIC has failed recently (SA#24915)” include concise log file data
  • IK-1672: Reduce sensitivity of “Errors have been found in packets received by NIC (SA#24915)”
  • IK-1671: “EIGRP unidirectional link identified” – improve accuracy
  • IK-1633: Cisco – improve troubleshooting when there is an issue with the privileged mode password
  • IK-1564: Improve discovery of Fortigates using banners
  • IK-1541: Reduce sensitivity of the “Proxy ARP Enabled” alert
  • IK-1250: Loopback alert – do not alert in case there is a management port
  • IK-1112: Add “vsx stat -l” and “vsx stat -v” to the debug report for VSX devices

Check Point Certificate(s) expired or about to expire

This is a real life sample alert from indeni

Description:

Some of the certificates configured on this management server have expired or are about to expire.

indeni will re-check this alert every 1 minute. If indeni determines the issue has been resolved, it will automatically be flagged as such.

Expiring/Expired Certificates:

  • Certificate with DN cn=john doe,ou=standard users,ou=users,ou=us,dc=us,dc=mycompany,dc=com with expiration date of Aug 06 11:48:39 2014 EST

This is a user certificate. An expired user certificate means that the user will not be able to log in.

  • Certificate with DN cn=jcnj-fw1 vpn certificate,o=northamerica.mycompany.com.hnuj7k with expiration date of Sep 29 17:55:29 2014 EST.

This is a firewall VPN certificate. An expired VPN certificate may mean a VPN tunnel going down.

Manual Remediation Steps:

Review the list of certificates and act according to the certificate type described above.

For VPN certificates, review SK61087.
For user certificates, review the Remote Access VPN documentation.

How does this alert work?

When indeni connects to Check Point CMAs or Domains (under MDM) it will automatically pull the certificates in use and track their expiration dates. The list of certificates is refreshed on an hourly basis.
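
To make the expiration tracking concrete, here is a minimal Python sketch of how a list of certificates could be classified as expired or about to expire. The function name `classify_certificate`, the 60-day warning window and the fixed “now” are assumptions for illustration; the source does not state which window indeni uses. The expiration dates are the ones from the sample alert above.

```python
# Sketch only: classifies certificates by expiration date.
# The 60-day warning window is a hypothetical value, not indeni's setting.
from datetime import datetime, timedelta


def classify_certificate(expires_at, now, warn_days=60):
    """Return 'expired', 'about to expire' or 'ok' for one certificate."""
    if expires_at <= now:
        return "expired"
    if expires_at - now <= timedelta(days=warn_days):
        return "about to expire"
    return "ok"


# Assumed check time, chosen for the example.
now = datetime(2014, 8, 1)
certificates = {
    "cn=john doe": datetime(2014, 8, 6, 11, 48, 39),
    "cn=jcnj-fw1 vpn certificate": datetime(2014, 9, 29, 17, 55, 29),
}
for dn, expires_at in certificates.items():
    print(dn, "->", classify_certificate(expires_at, now))
```

Running the same classification each time the certificate list is refreshed would let a resolved certificate drop out of the alert on its own.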

F5® Alert of the Week: Pool member response time too high

This is a real life sample alert from indeni

Description:

The ping time for pool member A (10.10.15.52) in the IIS_server pool is too high compared to other pool members. It is currently 512ms whereas the average (excluding this member) for the pool is 20ms.

indeni will re-check this alert every 1 minute. If indeni determines the issue has been resolved, it will automatically be flagged as such.

Manual Remediation Steps:

Review the network connectivity to the pool member as well as possible issues on the host itself.

For more information on how to set up pools, refer to Configuring Load Balancing Pools in the F5 user guide.

How does this alert work?

indeni loads the configuration for all pools and pool members on each LTM® and pings them periodically. If one of the members exhibits a higher response time than the others, the alert is triggered.
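
The comparison described above, one member’s ping time against the average of the others, can be sketched in Python as follows. The function name `slow_pool_members`, the factor of 5 and the two extra member addresses are hypothetical; the source does not say how much higher a response time must be before indeni alerts.

```python
# Sketch only: flags pool members that are much slower than their peers.
# The factor of 5 is an assumed value for illustration.

def slow_pool_members(response_ms, factor=5.0):
    """Return (member, value, baseline) for members slower than factor x peers' average."""
    slow = []
    for member, value in response_ms.items():
        # Average of every other member, excluding the one being checked.
        others = [v for m, v in response_ms.items() if m != member]
        if not others:
            continue
        baseline = sum(others) / len(others)
        if value > factor * baseline:
            slow.append((member, value, baseline))
    return slow


# 10.10.15.52 comes from the sample alert; the other two members are made up.
pool = {"10.10.15.52": 512.0, "10.10.15.53": 19.0, "10.10.15.54": 21.0}
for member, value, baseline in slow_pool_members(pool):
    print(f"{member}: {value:.0f}ms vs. {baseline:.0f}ms average for the other members")
```

Excluding the member under test from the average keeps one very slow server from inflating the baseline it is compared against.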

 

Top 5 Check Point experts announced!

Thank you all for the amazing response to our effort to put the spotlight on the world’s top Check Point experts (read our original post). It was impressive to see people handing kudos to one another, sometimes from one side of the globe to the other.

There is an incredibly large group of 76,207 professionals who have some sort of experience with Check Point’s products (according to LinkedIn). Here are the top 5 according to your votes:

And here is a textual version of the above:

Tobias Lachmann (LinkedIn), Data Center IT Architect, akquinet outsourcing gem. GMBH
CPShared handle: tobbela
Web: http://blog.lachmann.org

Valeri Loukine (LinkedIn), Senior Security Consultant, Dimension Data SA
CPShared handle: varera
Web: http://checkpoint-master-architect.blogspot.com

Patrick Waters (LinkedIn), Co-Founder and Security Architect, Bridgian
CPShared handle: fireverse
Web: http://www.fireverse.org

Whatcha McCallum (LinkedIn), US Escalations Group Manager, Check Point Software Technologies
CPShared handle: whatchamccallum
Web: http://forums.checkpoint.com/forums

Devin Siwak (LinkedIn), Chief Technology Officer & Co-Founder, Vcura Incorporated
Web: http://www.vcura.com

 

Check Point hosts file corrupted or missing entries

This is a real life sample alert from indeni

Description:

The operating system could not find the IP address associated with “localhost”. This normally means the hosts file has an error that may result in problems with certain services.

indeni will re-check this alert every 1 minute. If indeni determines the issue has been resolved, it will automatically be flagged as such.

Manual Remediation Steps:

Review the hosts file for any missing entries. Specifically, look for the “127.0.0.1 localhost” entry. On Unix-based operating systems (Linux, FreeBSD, SecurePlatform, IPSO, etc.) it will be “/etc/hosts”. On Windows-based operating systems it should be “c:\Windows\System32\drivers\etc\hosts”.

For more information on the importance of the hosts file see SK42952 and CCMA’s blog post.

How does this alert work?

indeni tracks the structure of the hosts file as well as tests to ensure the required host entries are present. In some cases, the test also involves using nslookup and ping commands to ensure the host name is resolved correctly.
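
The “required host entries are present” part of the test can be illustrated with a small Python sketch. It only checks for the “127.0.0.1 localhost” entry in the text of a hosts file; the function name `has_localhost_entry` is hypothetical, and the nslookup and ping checks mentioned above are not covered here.

```python
# Sketch only: checks hosts-file text for the "127.0.0.1 localhost" entry.

def has_localhost_entry(hosts_text):
    """Return True if some line maps 127.0.0.1 to the name 'localhost'."""
    for raw_line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        if address == "127.0.0.1" and "localhost" in names:
            return True
    return False


good = "127.0.0.1 localhost localhost.localdomain\n10.10.1.5 fw1\n"
bad = "# 127.0.0.1 localhost\n10.10.1.5 fw1\n"
print(has_localhost_entry(good), has_localhost_entry(bad))
```

On a real system the text would come from reading “/etc/hosts” or its Windows counterpart.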

F5 Provisioning settings unequal across device group

This is a real life sample alert from indeni

Description:

This device has provisioning settings that differ from other devices it is syncing with. To ensure optimal operation these must be the same. The current provisioning settings on this device are:
* Local Traffic (LTM®) – Nominal
* WAN Optimization (WOM) – Nominal

indeni will re-check this alert every 1 minute. If indeni determines the issue has been resolved, it will automatically be flagged as such.

Mismatching Devices:

jcnc-adc2 (10.10.145.3)

* Local Traffic (LTM®) – Nominal
* WAN Optimization (WOM) – Dedicated

Manual Remediation Steps:

Review the provisioning settings and ensure they match.

For more information, refer to SOL13946.

How does this alert work?

While monitoring devices, indeni automatically determines how they are related. In this example, indeni determined two devices are part of a device group and are set to sync. As such, indeni compares configurations that are NOT synced to ensure they match – such as provisioning settings, licensing settings, hardware type, etc.
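
A comparison like the one described above can be sketched in Python as a diff between the local device’s provisioning settings and those of each peer in the device group. The function name `provisioning_mismatches` and the dictionary layout are assumptions made for illustration; the setting values come from the sample alert.

```python
# Sketch only: compares non-synced provisioning settings across a device group.
# The data layout (module name -> provisioning level) is an assumption.

def provisioning_mismatches(local, peers):
    """Return {peer: {module: (local_level, peer_level)}} for differing settings."""
    mismatches = {}
    for peer_name, peer_settings in peers.items():
        diff = {}
        # Look at every module provisioned on either side.
        for module in sorted(set(local) | set(peer_settings)):
            ours = local.get(module)
            theirs = peer_settings.get(module)
            if ours != theirs:
                diff[module] = (ours, theirs)
        if diff:
            mismatches[peer_name] = diff
    return mismatches


local = {"LTM": "Nominal", "WOM": "Nominal"}
peers = {"jcnc-adc2": {"LTM": "Nominal", "WOM": "Dedicated"}}
print(provisioning_mismatches(local, peers))
```

The same pattern would apply to other settings that are not synced, such as licensing or hardware type, by swapping in a different dictionary.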