Manual Creation of Log Rules Is So 1999

SolarWinds just came out with a contest called Rule Your Log Data. In this contest, they are encouraging the community to build rules using their log management tool (Log & Event Manager) and submit them for review.

SolarWinds, much like Splunk, Sumo Logic, Loggly and dozens of other log server providers, has built a great product for storing logs and analyzing them. You can easily query the logs, build rules and alerts and sift through the data when you need to.

The challenge is – you don’t have the time to do it. Every single engineer I’ve ever asked says they do not actively watch the logs flowing into their log management system. Instead, they wait for an outage to occur, then they do the root cause analysis using the logs. Once the issue is found, they create a rule to alert when a similar log (or set of logs) occurs in the future.
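To make that workflow concrete, here is a minimal sketch of what such a hand-built rule boils down to. The log lines and the pattern are hypothetical, written purely for illustration, and are not taken from any real device or log management product.

```python
import re

# Hypothetical pattern an engineer might write after a root cause analysis.
# The log text is illustrative only and not taken from a real device.
RULE = re.compile(r"connection table .* full", re.IGNORECASE)


def matching_lines(lines):
    """Return the log lines that match the hand-written rule."""
    return [line for line in lines if RULE.search(line)]


sample = [
    "Jan 10 12:00:01 fw1 kernel: connection table is full, dropping packets",
    "Jan 10 12:00:02 fw1 sshd: accepted login for admin",
]

for line in matching_lines(sample):
    print("ALERT:", line)
```

Every rule like this one only catches the outage you have already had, which is exactly why the approach does not scale.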

That is not scalable:

  1. It takes too long to query the log database and build these rules.
  2. Sharing the rules you’ve created is never straightforward and is sometimes really difficult (see an example for SolarWinds Log & Event Manager).
  3. Most importantly – as a user – you’d want someone else to create those rules for you.

This last point drove us at indeni to build an automated mechanism for generating log analysis rules. indeni today starts with pulling the logs out of the devices we analyze (such as Check Point firewalls, Cisco routers, switches and firewalls, F5 load balancers and Palo Alto Networks firewalls). Then, it compares those logs to the knowledge that exists online pertaining to what logs require special attention. For example, if there is a knowledge article on the manufacturer’s website describing a certain issue, indeni will use that information to determine if certain logs indicate the issue described in the article. Yes, you read that correctly – indeni automatically generates log analysis rules based on data available on the Internet.

So – if your network management system asks you to create rules for alerting about certain logs, you should think long and hard about whether that system is built for the challenges of 2015. Then, go to try.indeni.com and give indeni a spin.

SNMP Errors: Go Beyond The Basics for SolarWinds© Orion© Users


Finding errors lurking in your network with an SNMP-based monitoring tool is soooooo 1999.

If you are a SolarWinds Orion user, you have probably realized that while NPM is an amazing product for SNMP-based monitoring, today’s challenges require a more in-depth solution. Out of the box, SolarWinds Orion NPM will give you these:

  • Network availability
  • Bandwidth capacity utilization
  • Buffer usage and errors
  • CPU and memory utilization
  • Interface errors and discards
  • Network latency
  • Node, interface, and volume status
  • Volume usage

(the list above was taken from NPM’s Quick Start Guide)

All of the items listed above have a few common aspects to them:

  1. They are simple numbers – bits, bytes, percentages.
  2. They are numbers compared to thresholds for alerting (e.g. “if CPU is above 80%, then alert”)
  3. They are generic to all networking devices.
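The threshold model described in the list above can be sketched in a few lines. This is an illustration only, not NPM’s implementation; the metric names and the 90% memory threshold are assumptions (only the 80% CPU example comes from the text).

```python
def threshold_alerts(metrics, thresholds):
    """Return the names of metrics whose value is above its threshold."""
    return sorted(
        name
        for name, value in metrics.items()
        if name in thresholds and value > thresholds[name]
    )


# Hypothetical metric names; the same thresholds apply to every device,
# regardless of vendor or role.
thresholds = {"cpu_percent": 80, "memory_percent": 90}
metrics = {"cpu_percent": 93.5, "memory_percent": 41.0}

print(threshold_alerts(metrics, thresholds))
```

Note that nothing in this logic knows what kind of device produced the numbers, which is the limitation discussed next.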

Back in 1999, when NPM was originally written, this approach was great. Network devices were simple (frankly, most of them were made by one of three or four manufacturers, such as Cisco) and the amount of know-how required to operate them was limited.

But it’s 2016. There are hundreds of network device manufacturers. Every enterprise we encounter is using at least a dozen of them. Each device has a plethora of features and getting certified on just a single product can take months. NPM’s approach to data collection and graphing isn’t sufficient for today’s needs. It makes sense if you think about it – it’s very difficult to rewrite a product from scratch to match the new requirements of the market.

indeni was written from scratch with today’s requirements in mind. That’s why indeni treats every network device very differently. The code and knowledge utilized by indeni to analyze a Cisco switch is vastly different to what is used with a Palo Alto Networks firewall. Even between firewall vendors, such as Check Point, Cisco, Fortinet, Juniper and Palo Alto Networks, the logic used by indeni to look for issues is unique to each product analyzed.

So, if you have Check Point firewalls, Cisco firewalls, Cisco routers, Cisco switches, F5 LTMs, Fortinet firewalls, Juniper firewalls and Palo Alto Networks firewalls, take a look at the top of this page and click on the product you are responsible for. You’ll see that out-of-the-box, with no additional work or configuration, indeni will uncover thousands more issues than NPM is capable of uncovering for the devices supported by indeni. What’s more, indeni keeps expanding the set of issues it can identify.

It’s 2016. It’s time to use 2016’s software.


Firewall Connection Table Limit Approaching or Reached – Check Point Firewall Alerts

This is a real-life sample alert from the indeni Check Point Firewall configuration guide.

Description:

There are 248742 concurrent connections while the limit is 250000. The connection table limit should be increased to ensure uninterrupted operation.

Manual Remediation Steps:

Upgrading to the GAIA OS can remove the need to set a connection table limit. If you decide to remain on IPSO, however, consider the following:

In many cases, a sudden spike in connections has been attributed to a worm or misbehaving application. If you have ruled this out, consider the following solutions:

  1. Locate the maximum concurrent connections setting for the firewall (normally found in the object’s properties) and increase the value. The increase should be done gradually and with care as it will also increase the memory usage of the firewall.
  2. Turn on Aggressive Aging to have connections removed as quickly as possible.
  3. In the SmartDashboard, go to Policy->Global Properties and in the Stateful Inspection tab reduce the TCP end timeout to 5 seconds. Please refer to the firewall’s user manual for more information on what the TCP end timeout is.

How does this alert work?

indeni tracks the number of entries in the connections table, using “fw tab -t connections -s”.
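As a rough sketch of the idea (not indeni’s actual code), the snippet below parses summary output like the command above prints and compares the entry count to the configured limit. The sample output layout, the peak value and the 90% warning ratio are assumptions made for illustration; check them against what your own firewall prints.

```python
# Illustrative output only; the column layout is an assumption and may
# differ between versions.
SAMPLE_OUTPUT = """\
HOST                  NAME                               ID #VALS #PEAK #SLINKS
localhost             connections                      8158 248742 249100 0
"""


def parse_connection_count(output):
    """Extract the #VALS value of the 'connections' row from summary output."""
    rows = [line.split() for line in output.splitlines() if line.strip()]
    header = rows[0]
    name_col = header.index("NAME")
    vals_col = header.index("#VALS")
    for row in rows[1:]:
        if len(row) > vals_col and row[name_col] == "connections":
            return int(row[vals_col])
    raise ValueError("connections table not found in output")


def connection_table_status(current, limit, warn_ratio=0.9):
    """Classify usage; warn_ratio is a hypothetical alerting threshold."""
    if current >= limit:
        return "reached"
    if current >= warn_ratio * limit:
        return "approaching"
    return "ok"


count = parse_connection_count(SAMPLE_OUTPUT)
print(count, connection_table_status(count, limit=250000))
```

With the numbers from the sample alert (248742 connections against a limit of 250000), this classifies the table as approaching its limit.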

indeni is Short-Listed for the 2015 Red Herring Top 100 North America Award

San Mateo, CA – May 18th, 2015 – indeni announced today it has been short-listed for Red Herring’s Top 100 North America award, a prestigious list honoring the year’s most promising private technology companies from the North American business region.

Red Herring has been selecting the most exciting and promising start-ups and “scale ups” since 1995. Finalists are still evaluated individually from a large pool of hundreds of candidates based across North America. Twenty major criteria underlie the scoring and process. They include, among others: the candidate company’s addressable market size, its IP and patents, its financing, the proof of concept, trailing revenues and management’s expertise. Each company goes through an individual interview after filling out a thorough submission, complemented by a due diligence process. The list of finalists often includes the best performing and prominent companies of that year.

“We are excited to be short-listed for the award”, commented Yoni Leitersdorf, indeni’s CEO and Founder. “2015 has been an amazing year for us and Red Herring’s recognition is greatly appreciated.”

Cutting Down On Alerting Noise: Response to SolarWinds’ Post

Recently a member of the support team at SolarWinds posted “Cutting Down On Alerting Noise: Guest Post From Support”.

The challenge of alert fatigue and noise is a big one. Many companies have attempted, and still are attempting, to solve this issue. BigPanda is an example of such a company.

In the original post, three tips are detailed.

The first tip discusses custom properties, which allow you to control who gets alerts and whether they are sent at all. This is a useful way to ensure that whoever gets an alert is someone who should get it.

The second tip instructs you to teach Orion what the dependencies are between your devices. This is something that I have personally rarely seen a user of SolarWinds NPM do.

The third tip, though, is a bit more interesting: the ability to create custom conditions, essentially teaching Orion how to look at more than one parameter to provide you with a more interesting and actionable alert. The challenge here, though, is that it is up to the user to define these. How will the user think of what to define? How much effort will it take them? This is a great feature, which SolarWinds introduced in NPM 11.5, but I’d be interested to see who uses it and how.

The reason this post is so interesting to me, and us at indeni, is that it strikes a chord with what we’re doing. Our operating assumption is that alerts must be actionable and as few and far between as possible, to ensure alert fatigue doesn’t settle in. To do that, we need to factor in relationships between devices (see tip #2 above) and complex conditions that factor in dozens of parameters (see tip #3 above).
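To illustrate the two ideas in general terms, here is a small sketch. It is neither Orion’s nor indeni’s logic; the device names, the dependency map and the thresholds are all hypothetical.

```python
def actionable_alerts(down_devices, depends_on):
    """Keep only devices that are down while their upstream device is not.

    depends_on maps a device to the device it is reached through.
    """
    down = set(down_devices)
    return sorted(d for d in down if depends_on.get(d) not in down)


def compound_condition(sample):
    """Fire only when several parameters point at the same problem."""
    return (
        sample["cpu_percent"] > 80
        and sample["connections"] > 0.9 * sample["connection_limit"]
    )


# Hypothetical topology: the branch devices sit behind the core router.
depends_on = {"branch-fw": "core-router", "branch-switch": "branch-fw"}
down = ["core-router", "branch-fw", "branch-switch"]

print(actionable_alerts(down, depends_on))
print(compound_condition(
    {"cpu_percent": 91, "connections": 248742, "connection_limit": 250000}
))
```

Three devices are down, but only the core router is worth an alert; the other two are consequences. The hard part, as noted below, is not writing this logic but knowing which relationships and conditions to encode.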

Our approach is proven – 97% of the alerts we issue get actioned immediately. That shows that alert noise isn’t a problem in our approach. I can tell you, however, that knowledge generation, the mechanism through which you determine relationships and complex conditions, is one. That’s the really tough nut to crack.

Thankfully, we’ve got smart guys working on that. 🙂

So – if you’re an Orion NPM user and are looking to get far deeper insight into devices made by Check Point, F5, Palo Alto Networks and others (see our supported technologies list) – you need to try indeni. One important feature of indeni is that it can replace your NPM for the specific devices indeni already supports, and you can have indeni’s alerts forwarded directly to your ticketing system.


Pulling Data via SNMP, SSH or API – PAN Firewall Best Practices

When querying a firewall, what’s the best protocol to use? SNMP, SSH or API?

If you are looking to integrate Palo Alto firewalls as part of some automated system – scripts, central NOC, software-defined-whatever, etc. – you’d want to hear what we have to share. You should also read this post if you like learning about interesting technical aspects of the products you use.

As you may know, we started supporting Palo Alto Networks (PANW) firewalls in our product late last year. We are currently developing new support and are working with large and small organizations throughout the globe. One interesting thing we’ve noticed that’s worth sharing is that PANW’s customers are very open to embracing new technologies. That is great for us 🙂
