The ability to locate a sound source is a core auditory ability used for many daily purposes [1]. Accurate sound localization depends on the coding, by neurons in the central nervous system, of various cues to the location of the sound. For ongoing high-frequency sounds, the major cue for the azimuthal location of the source is the difference in sound level between the two ears (Interaural Level Differences, ILDs; formerly termed Interaural Intensity Differences, IIDs) [2]. ILDs vary as a sound source moves about an animal and are created by head and body shadowing effects, which affect high-frequency sounds more than low-frequency sounds [3]. There is a vast literature on the importance of ILDs and on how neurons at various brain levels respond to ILDs that cover a wide azimuthal range across frontal space, from opposite one ear across to opposite the other. In mammals this cue is first functionally coded by neurons in the auditory brainstem and then relayed to the Inferior Colliculus (IC), but it is clear that in at least some species (including the rat studied here), ILD sensitivity is also created *de novo* in many IC neurons [4].

Different IC neurons appear to use different combinations of interactions between excitatory and inhibitory inputs to code ILDs (a set of neuronal operations that also appears to be used in auditory cortex) [5], producing a diversity of forms of ILD sensitivity within this one auditory structure. This diversity argues against using a single network model to describe all the different forms of ILD sensitivity.

### Introduction to data normalization

Data normalization is a scaling process for the numbers in a data array and is used where great heterogeneity in the numbers makes standard statistical analysis difficult. Data are usually normalized before any further processing, so data normalization is commonly regarded as a data pre-processing step. Many different data normalization techniques have been developed in diverse scientific fields, e.g. in statistical analysis for applications such as diagnostic circuits in electronics [6], temporal coding in vision [7], predictive control systems for seismic activity [8], modeling Auditory Nerve stochastic properties [9], modeling labor market activity [10], pattern recognition [11], and, most extensively, microarray data analysis in genetics [12–20].

The need for data normalization is determined by the user and depends on the proposed application. Its purposes include linear scaling to compress a large dynamic range [6], scaling of values to correct for variation in laser intensity [18], handling obscure variation [12], removing systematic errors in data [11, 15, 17, 20], and efficiently removing redundancy in a non-linear model as an optimal transformation for temporal processing [7]. Although the benefits of data normalization depend on data type, data size and normalization method (which can vary between fields), the general advantages of data normalization are (a) to give a more meaningful range of scaled numbers for use, (b) to rearrange the data array into a more regular distribution, (c) to enhance the correctness of subsequent calculations, and (d) to increase the significance or importance of the most descriptive numbers in a non-normally distributed data set.
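As an illustration of linear scaling to compress a large dynamic range, the following minimal sketch shows two common normalizations, min-max scaling and z-score standardization. It is not the procedure used in this study; the function names and the `rates` values are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def min_max_scale(values, lo=0.0, hi=1.0):
    """Linearly rescale values so the smallest maps to lo and the largest to hi."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant array has no range to rescale; map it to the midpoint.
        return [(lo + hi) / 2.0 for _ in values]
    span = v_max - v_min
    return [lo + (hi - lo) * (v - v_min) / span for v in values]


def z_score(values):
    """Centre values on zero and scale them to unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical, strongly heterogeneous values (e.g. firing rates in spikes/s).
rates = [2.0, 10.0, 50.0, 200.0]
scaled = min_max_scale(rates)    # compressed into the range [0, 1]
standardized = z_score(rates)    # zero mean, unit standard deviation
```

Min-max scaling preserves the ordering and relative spacing of the values, whereas z-scoring expresses each value relative to the spread of the array; which is more appropriate depends on the application.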

### Introduction to data dimension reduction technique

Principal component analysis (PCA) is a statistical tool for reducing the dimensions of a large data set for the purpose of data classification. A data set that can be described in a number of different ways, or by a number of different variables (such as slope steepness, cut-off position, peak location, maximum firing rate, etc.), is said to possess many dimensions. Such data are difficult to classify because it is often not known which of these dimensions are the most important or, indeed, whether any single one of them is the most important. In such a case, some means has to be devised to reduce the dimensions of the data set to a single dimension across all the data. This single dimension can then be used to differentiate between sub-groups within the overall data set. PCA is a powerful statistical tool that does precisely this.

PCA is used as an independent statistical method for data classification and can handle both metric and multivariate data [21]. PCA relies on the data variables being correlated with one another; if the variables were uncorrelated, principal components would not be suitable for reducing the dimensions of the data. *Bartlett’s Sphericity Test* can be used to verify that the data meet this condition [22], but the details of this test are beyond the scope of this manuscript.
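The reduction of several correlated variables to a single dimension can be sketched as follows. This is a minimal illustration, not the implementation used in this study: it estimates only the first principal component by power iteration on the covariance matrix, and the `features` array (three hypothetical variables per neuron, two of them strongly correlated) is invented for the example. In practice, variables measured on different scales would normally be normalized first.

```python
import math


def covariance_matrix(rows):
    """Return the sample covariance matrix and the mean-centred rows."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    centred = [[x - m for x, m in zip(row, means)] for row in rows]
    d = len(means)
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    return cov, centred


def first_principal_component(rows, iterations=500):
    """Estimate the leading eigenvector of the covariance matrix by power iteration."""
    cov, centred = covariance_matrix(rows)
    d = len(cov)
    # Non-uniform start vector; power iteration fails only if the start is
    # exactly orthogonal to the leading eigenvector, which this makes unlikely.
    vec = [1.0 / (i + 1) for i in range(d)]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    # Rayleigh quotient gives the variance captured along this direction.
    eigenvalue = sum(vec[i] * sum(cov[i][j] * vec[j] for j in range(d))
                     for i in range(d))
    total_variance = sum(cov[i][i] for i in range(d))
    # Project every centred observation onto the component: one score per row.
    scores = [sum(c * v for c, v in zip(row, vec)) for row in centred]
    return vec, scores, eigenvalue / total_variance


# Hypothetical data: columns 1 and 2 are strongly correlated, column 3 varies little.
features = [
    [1.0, 2.1, 0.5],
    [2.0, 3.9, 0.4],
    [3.0, 6.2, 0.6],
    [4.0, 7.8, 0.5],
    [5.0, 10.1, 0.4],
]
direction, scores, explained = first_principal_component(features)
```

Here `scores` is the single dimension onto which the data are reduced, and `explained` is the fraction of total variance it retains. When the variables are strongly correlated, as in this example, one component captures most of the variance; with uncorrelated variables it would not.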

### Introduction to cluster analysis

Data classification is a way of segregating similar types of data into homogeneous clusters. Each of these compact data groups contains a number of data elements with comparable characteristics. In data classification studies, two methods are generally used to distinguish the classified data, namely *Supervised* (discriminant analysis) and *Unsupervised* (data clustering) classification [23].

Data characterization can be planned as a two-step procedure in which PCA, for reduction of data dimensions, is followed by *Cluster Analysis*, for grouping similar types of data objects. This technique has been widely used in a diverse range of scientific fields, including crime analysis [24], finding the relationship between retention parameters and physicochemical parameters of barbiturates [25], chemometric methods for characterizing steel alloy samples [26], drug design [27], isolating single-unit activity for data acquisition [28], and microarray-based gene identification [29, 30]. A review of this combined technique for several clustering algorithms [31] emphasized the importance of applying PCA prior to Cluster Analysis for high-dimensional data.
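The second step of this two-step procedure can be sketched as below. The text above does not name a particular clustering algorithm, so k-means is used here purely as an assumption for illustration. The input `pc_scores` is a hypothetical list of first-principal-component scores (one per data object), and the function name `kmeans_1d` and the choice of three clusters are likewise illustrative.

```python
def kmeans_1d(values, k, iterations=100):
    """Group one-dimensional values into k clusters with Lloyd's k-means algorithm."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centres evenly over the sorted values.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins the cluster with the nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centres.append(sum(members) / len(members) if members else centres[c])
        if new_centres == centres:
            break  # converged: assignments can no longer change
        centres = new_centres
    return labels, centres


# Hypothetical scores on the first principal component for nine data objects.
pc_scores = [-2.1, -1.9, -2.3, 0.1, -0.1, 0.2, 2.0, 2.2, 1.9]
labels, centres = kmeans_1d(pc_scores, k=3)
```

Because the clustering operates on the reduced scores rather than on the original variables, each data object is compared along the one dimension that PCA identified as carrying most of the variance. The number of clusters is a free parameter that has to be chosen or validated separately.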