Thursday 13 August 2015

Parse, TryParse, Convert

We've talked a lot about converting data between types. We talked about testing whether our data is of a specific value or reference type using the "is" comparison, we used the "as" operator to convert one reference type into another (getting null back when the conversion fails), and we talked about implicit vs explicit casting.

Now the question is, what do we do if we have a string that actually contains the value for a number or a date? Well, we can use the int.Parse() or DateTime.Parse() functions.


using System;

namespace pav.tryparse
{
    class Program
    {
        static void Main(string[] args)
        {
            var strInt = "123";
            int i = int.Parse(strInt);

            //DateTime.Parse uses the current culture; month/day/year assumes one such as en-US
            var strDate = "1/31/1984";
            var date = DateTime.Parse(strDate);
        }
    }
}


Now that's great, but what if our string doesn't contain what we expect? Let's say, for example, that we got our date from a European web service that returns it in the dd/mm/yyyy format, or that we expected an int value of 123 but instead got the string "sdf". Well, using the Parse method in this fashion will throw a FormatException.

To get around our problem we can use the TryParse methods.


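using System;

namespace pc.tryparse
{
    class Program
    {
        static void Main(string[] args)
        {
            var strInt = "123";
            int i;
            //TryParse returns false instead of throwing; i is 0 on failure
            bool intOk = int.TryParse(strInt, out i);

            var strDate = "31/1/1984";
            DateTime d;
            //false under a culture that expects month/day/year, such as en-US
            bool dateOk = DateTime.TryParse(strDate, out d);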
        }
    }
}


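This approach first tests whether our string inputs represent a value of the target type. If so, TryParse returns true and the parsed value is assigned to the out parameter; if not, it returns false and the out parameter is set to the type's default value, with no exception thrown.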
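As a rough sketch of how the return value can be used, the snippet below checks the bool that TryParse gives back, and uses DateTime.TryParseExact with the invariant culture so that a day-first date parses the same way on any machine. The TryParseSketch class and the TryParseDayFirst helper are made-up names for illustration, and the "d/M/yyyy" format is an assumption about what the web service sends.

```csharp
using System;
using System.Globalization;

static class TryParseSketch
{
    // Hypothetical helper: reads a day-first date such as "31/1/1984"
    // without depending on the machine's regional settings.
    public static bool TryParseDayFirst(string text, out DateTime value)
    {
        return DateTime.TryParseExact(
            text,
            "d/M/yyyy",                   // day, then month, then four-digit year
            CultureInfo.InvariantCulture, // fixed rules, independent of the current culture
            DateTimeStyles.None,
            out value);
    }

    static void Main()
    {
        int i;
        if (int.TryParse("sdf", out i))
            Console.WriteLine("parsed " + i);
        else
            Console.WriteLine("not a number, i was reset to " + i);

        DateTime d;
        if (TryParseDayFirst("31/1/1984", out d))
            Console.WriteLine("parsed month " + d.Month + ", day " + d.Day);
        else
            Console.WriteLine("not a day-first date");
    }
}
```

Spelling out the format is a trade-off: it rejects anything that doesn't match exactly, which is usually what you want for data coming from a service rather than from a person.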

Now there's also a static Convert class with overloaded static conversion functions such as Convert.ToInt32 and Convert.ToDateTime.

using System;

namespace pc.tryparse
{
    class Program
    {
        static void Main(string[] args)
        {
            var strInt = "123";
            int i = Convert.ToInt32(strInt);

            var strDate = "31/1/1984";
            DateTime d = Convert.ToDateTime(strDate);
        }
    }
}

These functions cover just about any standard conversion you'd want to make; however, they introduce the FormatException issue once again.
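To make the difference from Parse a little more concrete, here is a minimal sketch that wraps Convert.ToInt32 in a try/catch. ConvertSketch and ToIntOrDefault are hypothetical names, and returning a fallback value is just one possible way to handle the failure.

```csharp
using System;

static class ConvertSketch
{
    // Hypothetical helper: returns a fallback instead of letting the exception escape.
    public static int ToIntOrDefault(string text, int fallback)
    {
        try
        {
            return Convert.ToInt32(text);
        }
        catch (FormatException)
        {
            return fallback; // text was not a number, e.g. "sdf"
        }
        catch (OverflowException)
        {
            return fallback; // text was a number outside the range of an int
        }
    }

    static void Main()
    {
        Console.WriteLine(ToIntOrDefault("123", -1));
        Console.WriteLine(ToIntOrDefault("sdf", -1));

        // Convert.ToInt32 treats a null string as 0,
        // whereas int.Parse(null) throws an ArgumentNullException.
        Console.WriteLine(ToIntOrDefault(null, -1));
    }
}
```

When bad input is expected rather than exceptional, TryParse is usually the cheaper choice, since it avoids throwing and catching altogether.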

Monday 10 August 2015

Boxing, is, as

When you convert a value type to a reference type this is called boxing; when you convert that reference back to a value type this is called unboxing. As a rule of thumb, local value-type variables live on the stack, whereas reference-type objects live on the heap, with a reference to that heap value stored on the stack. Think of the stack as a list: if something is small and simple it goes onto the list; if it's big and complex and doesn't fit on the list, we write down on our list an address where we can find the complex thing.

Unboxing is where you can fall into some troubled waters; as long as the type you're unboxing to is correct everything is fine, but should you get it wrong you'll get an InvalidCastException at run time, which can suck.


namespace pav.boxing
{
    class Program
    {
        static void Main(string[] args)
        {
            // boxing
            object o = 32;

            // unboxing
            int i = (int)o;

            //throws System.InvalidCastException
            bool b = (bool)(o);
        }
    }
}


Luckily we have the "is" comparison, which we can use on a reference type such as object to test whether it actually holds the value type we want to convert to.


namespace pav.boxing
{
    class Program
    {
        static void Main(string[] args)
        {
            // boxing
            object o = 32;

            if (o is bool)
            {
                //this will never fire because our object is not a bool
                var b = (bool)o;
            }
        }
    }
}


The "is" comparison also works when comparing a reference type to another reference type:


using System;

namespace pav.boxing
{

    class Person { }

    class Employee : Person { }

    class Program
    {
        static void Main(string[] args)
        {
            object o = 32;

            if (o is int) //true
                Console.WriteLine("o is an int");
            else
                Console.WriteLine("o is not an int");

            if (o is Person) //false
                Console.WriteLine("o is a person");
            else
                Console.WriteLine("o is not a person");

            var p = new Person();
            var e = new Employee();

            if (e is Person) // true
                Console.WriteLine("e is a Person");
            else
                Console.WriteLine("e is not a person");

            if (p is Employee) // false
                Console.WriteLine("p is an employee");
            else
                Console.WriteLine("p is not an employee");
        }
    }
}


Not only can you use it to identify if your variable is of a particular type, but also if it's of a base type. In our example, our instance of Employee e is a Person, but our instance of Person p is not an Employee. Keep in mind that casting between reference types is not considered 'boxing' or 'unboxing'; these terms are reserved for value types to reference types and vice-versa. Converting between reference types is considered 'upcasting' or 'downcasting', because at no point is new memory allocated.
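The "no new memory" point can be checked with object.ReferenceEquals. The following is only a sketch; CastSketch is a made-up class name, and Person and Employee are redeclared so that the snippet stands on its own.

```csharp
using System;

class Person { }

class Employee : Person { }

static class CastSketch
{
    static void Main()
    {
        var e = new Employee();
        Person p = e;                 // upcast: implicit, always succeeds
        Employee back = (Employee)p;  // downcast: explicit, checked at run time

        // All three variables point at the very same object.
        Console.WriteLine(object.ReferenceEquals(e, p));    // True
        Console.WriteLine(object.ReferenceEquals(e, back)); // True

        // Boxing, by contrast, allocates a new object every time.
        int i = 32;
        object box1 = i;
        object box2 = i;
        Console.WriteLine(object.ReferenceEquals(box1, box2)); // False
    }
}
```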

Let's take a quick look at the "as" operator. The as operator attempts a conversion and gives us null instead of an exception when it fails, which means the target type has to be nullable.


static void Main(string[] args)
{
    object o = 32;
   
    //compile-time error, not allowed because int is not nullable
    var i = o as int;
}

Since plain value types are not nullable, they can't be the target of the "as" operator (their nullable forms, such as int?, can). Reference types, however, are nullable.


using System;

namespace pav.boxing
{
    class Person { }

    class Employee : Person { }

    class Program
    {
        static void Main(string[] args)
        {
            object o = new Employee();

            var e = o as Employee;
            if (e != null)
                Console.WriteLine("o is an employee");

            var p = o as Person;
            if (p != null)
                Console.WriteLine("o is a person");

            e = new Person() as Employee;
            if (e != null)
                Console.WriteLine("e is an employee");
            else
                Console.WriteLine("e is not an employee");
        }
    }
}


So when we successfully convert a type using the as operator we receive our target type; however, if the conversion fails we get a null. A subtype can always be converted into a base type, but a base-type reference only converts to a subtype when the object it points to really is that subtype.
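The same null-on-failure behaviour is available for value types if the target is the nullable form of the type. A small sketch, where AsNullableSketch and AsInt are hypothetical names:

```csharp
using System;

static class AsNullableSketch
{
    // Hypothetical helper: returns the boxed int, or null if o holds something else.
    public static int? AsInt(object o)
    {
        return o as int?;
    }

    static void Main()
    {
        object o = 32;

        int? n = AsInt(o);
        Console.WriteLine(n.HasValue);            // True, the box holds an int

        Console.WriteLine(AsInt("32").HasValue);  // False, a string is not a boxed int
    }
}
```

Note that this only looks at the type inside the box; it does not parse the string "32" into a number.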

To sum it up, in C#, 'boxing' is the process of converting a value type (such as an int or a struct) to a reference type (such as an object). This is done by creating a new object and copying the value of the value type into the new object. 'Unboxing' is the opposite process, where a reference type (such as an object) is converted back to a value type. This is done by copying the value stored in the object back into a value type variable.

Boxing and unboxing can have a performance cost, as they involve creating and manipulating objects in memory. Therefore, it is generally recommended to avoid unnecessary boxing and unboxing in performance-critical code.
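Two consequences of "boxing copies the value" are easy to miss, so here is a short sketch of both: the box is independent of the original variable, and unboxing has to name the exact type that was boxed. BoxingSketch and UnboxIntAsLong are names invented for this example.

```csharp
using System;

static class BoxingSketch
{
    // Hypothetical helper: unboxes to the exact type first, then widens.
    public static long UnboxIntAsLong(object boxedInt)
    {
        return (long)(int)boxedInt;
    }

    static void Main()
    {
        int i = 32;
        object o = i;   // boxing copies the value into a new object
        i = 64;         // changing the original does not touch the box
        Console.WriteLine((int)o); // 32

        try
        {
            long wrong = (long)o; // the box holds an int, not a long
            Console.WriteLine(wrong);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("cannot unbox an int directly to a long");
        }

        Console.WriteLine(UnboxIntAsLong(o)); // 32
    }
}
```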

One thing to keep in mind is that technically, since strings and classes are already reference types, they cannot be 'boxed'. Only value types such as int, float, structs, etc. can be 'boxed', because a value type holds its data directly rather than through a reference to an object on the heap.

When you assign a reference type to an object variable, you simply get another reference to the same object; no new object is created. So although reference types cannot be boxed, they can be stored as objects; just keep in mind that this does not qualify as 'boxing'. Casting a reference type such as a string or a class to an object is considered upcasting, whereas converting an object back to a string or class is considered downcasting.

Wednesday 5 August 2015

Widening & Narrowing Conversions

Widening and narrowing conversions refer to the conversion of one data type to another data type that has a larger or smaller range of values. Everything in an application is stored as bits, each of which is either a '1' or a '0', an "On" or "Off", however you prefer to think of it. Those bits are interpreted to accomplish everything to do with a computer, from booting up to shutting down and everything in-between. Luckily we don't have to worry about such granular details.

Different value types are represented by different numbers of bits. For example, let's say you cast a short, which is made up of 16 bits (2^16 possible values), to an int, which has 32 bits (2^32 possible values). Well, that would be like pouring a shot into a pint glass: it's going to work every time, which is why it doesn't need an explicit cast.


short myShort = 32767;

//implicit cast
int myInt = myShort;

//outputs 32767
Console.WriteLine(myInt);


However, the opposite isn't true: if we cast an int into a short we need the explicit cast, because we have the danger of an overflow, trying to put more bits into a container than can fit.

   
    static void Main(string[] args)
    {
        short myShort = 32767;

        //implicit cast
        int myInt = myShort;

        //outputs 32767
        Console.WriteLine(myInt);

        //explicit cast
        myShort = (short)myInt;

        //outputs 32767
        Console.WriteLine(myShort);
    }


In the case above we're still in the clear, because we've basically poured a shot into a pint glass and then back into a shot glass.

But what if we were to add 1 to our short while it's in the pint glass, before pouring it back into the shot glass?


class Program
{
    static void Main(string[] args)
    {
        short myShort = 32767;

        //implicit cast and increase by 1
        int myInt = myShort + 1;

        //outputs 32768
        Console.WriteLine(myInt);

        //explicit cast
        myShort = (short)myInt;

        //outputs -32768
        Console.WriteLine(myShort);
    }
}


Well, we'd have spillage, but something strange has happened: why on earth is our short a negative number? Remember when I said that everything is represented by bits? Let's take a look at what our initial value looks like in binary.

32767 in hexadecimal is 7FFF which in binary is 0111 1111 1111 1111
by adding just 1 to our value we transform our number into
32768 in hexadecimal is 8000 which in binary is 1000 0000 0000 0000

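The reason why we get a value of -32768 is that a short is stored in two's complement form: the first bit signifies whether the value is negative or positive, so if it's 1 then the number is negative. That first bit still counts towards the value, though; it is worth -32768 instead of +32768, and the remaining 15 bits are added on top of it.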


class Program
{
    static void Main(string[] args)
    {
        short myShort = 32767;

        //prints +32767 = 0111111111111111 binary
        //(Convert.ToString drops leading zeros, so the first 0 is added by hand)
        Console.WriteLine($"+{myShort} = 0{Convert.ToString(myShort, 2)} binary");

        //implicit cast and increase by 1
        int myInt = myShort + 1;

        //prints +32768 = 1000000000000000 binary
        Console.WriteLine($"+{myInt} = {Convert.ToString(myInt, 2)} binary");

        //explicit cast
        myShort = (short)myInt;

        //prints -32768 = 1000000000000000 binary
        Console.WriteLine($"{myShort} = {Convert.ToString(myShort, 2)} binary");
    }
}
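For anyone who wants to poke at the sign bit a bit more, here is a small sketch that reinterprets the same 16 bits as a signed short. TwosComplementSketch and FromBits are hypothetical names; the unchecked keyword is there so the conversion keeps the raw bits even if the project is compiled with overflow checking turned on.

```csharp
using System;

static class TwosComplementSketch
{
    // Hypothetical helper: reinterprets 16 raw bits as a signed short.
    public static short FromBits(ushort bits)
    {
        return unchecked((short)bits);
    }

    static void Main()
    {
        Console.WriteLine(FromBits(0x7FFF)); // 32767: sign bit off, all other bits on
        Console.WriteLine(FromBits(0x8000)); // -32768: only the sign bit on
        Console.WriteLine(FromBits(0xFFFF)); // -1: -32768 + 32767
    }
}
```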


Anyway, that's a nice little tangent into how the magic works, but let's not concern ourselves too much with that. Our problem is that when we convert with a narrowing conversion and have spillage we are none the wiser, which could result in some serious bugs. Luckily we can wrap explicit conversions in a checked block, which will force an OverflowException when an overflow occurs.


using System;

namespace pav.WideNarrow
{
    class Program
    {
        static void Main(string[] args)
        {
            short myShort = 32767;
            Console.WriteLine($"+{myShort} = 0{Convert.ToString(myShort, 2)} binary");

            //implicit cast and increase by 1
            int myInt = myShort + 1;
            //prints +32768 = 1000000000000000 binary
            Console.WriteLine($"+{myInt} = {Convert.ToString(myInt, 2)} binary");

            //explicit cast in checked block
            checked
            {
                //will throw an overflow exception when appropriate
                myShort = (short)myInt;
            }
        }
    }
}


To wrap up, when you go from a narrow scope to a wide one you are performing a widening conversion. This is sometimes loosely called upcasting, although strictly speaking that term belongs to reference types in an inheritance hierarchy. A widening conversion involves converting a data type with a smaller number of bits to one with a larger number of bits. These conversions are considered safe and are typically implicit, meaning that they do not need an explicit cast.

When you go from a wide scope to a narrow one, this is called a narrowing conversion (loosely, downcasting, with the same caveat). A narrowing conversion involves converting a data type with a larger number of bits to one with a smaller number of bits, potentially losing data. These conversions must be explicit and require the use of casting.

It's important to keep in mind that while widening is generally safe, narrowing can silently produce an unexpected value. If the cast is wrapped in a checked block, however, an OverflowException is thrown whenever the value being cast is not within the range of the target data type, and that exception can be caught.
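Catching that exception could look something like the sketch below. NarrowingSketch and TryNarrow are made-up names, modelled on the TryParse pattern; this is one way to do it, not the only one.

```csharp
using System;

static class NarrowingSketch
{
    // Hypothetical helper: narrows an int to a short, reporting failure instead of wrapping around.
    public static bool TryNarrow(int value, out short result)
    {
        try
        {
            result = checked((short)value); // throws if value does not fit in 16 bits
            return true;
        }
        catch (OverflowException)
        {
            result = 0;
            return false;
        }
    }

    static void Main()
    {
        short s;
        Console.WriteLine(TryNarrow(32767, out s)); // True
        Console.WriteLine(TryNarrow(32768, out s)); // False

        // Without the check the same cast silently wraps around.
        Console.WriteLine(unchecked((short)32768)); // -32768
    }
}
```

A plain range check against short.MinValue and short.MaxValue would do the same job without an exception; the checked version is shown here because it is the mechanism this post is about.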

Saturday 1 August 2015

Implicit vs Explicit Conversion

Implicit conversion refers to the automatic conversion of one data type to another without the use of explicit casting. For example, if you're casting from a short (16 bit) to an int (32 bit), since a short has fewer bits allocated to it than an int, you know for certain that the short will fit into the int and no explicit cast is required.

When it comes to the other way around, when you try to fit an int into a short it's a different story. If the value of the int is greater than what the short can store you're in for some surprises, but we'll talk about those later.


namespace pc.converstion
{
    class Program
    {
        static void Main(string[] args)
        {
            //Narrowing Conversion requires Explicit cast
            int myInt = 133;
            short myShort = (short)myInt;

            //Widening Conversion may have an implicit cast
            short myShort2 = 34;
            int myInt2 = myShort2;

            //but there's nothing wrong with having an Explicit cast
            int myInt3 = (int)myShort2;

            //even though a bool should fit into a short or int,
            //explicitly or implicitly this is not allowed;
            //both lines below are compile errors if uncommented
            bool myBool = true;
            //short failedExplicit = (short)myBool;
            //short failedImplicit = myBool;

            //a workaround is cast 1 or 0 as a short or byte
            short workaround = myBool ? (short)1 : (byte)0;

            //works just as expected.  
            int workaround2 = myBool ? 1 : 0;
        }
    }
}


In conclusion, if you're converting from a smaller value type to a larger one you shouldn't have any issues, except for bools: bools can't be cast implicitly or explicitly to other value types. I mention this because in JavaScript and other languages, true is synonymous with 1 and false with 0.

However, a reasonable workaround is to assign 1 if your boolean value is true and 0 if it is false. Remember that there is no literal suffix for shorts or bytes, meaning that you have to explicitly cast your numeric values to them.
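Another option, offered here as an alternative rather than as something the workaround above needs, is the static Convert class, which has overloads that accept a bool. A minimal sketch, with BoolSketch as a made-up class name:

```csharp
using System;

static class BoolSketch
{
    static void Main()
    {
        bool myBool = true;

        // Convert maps true to 1 and false to 0.
        short asShort = Convert.ToInt16(myBool);
        byte asByte = Convert.ToByte(false);
        Console.WriteLine(asShort); // 1
        Console.WriteLine(asByte);  // 0

        // Going the other way, any non-zero number becomes true.
        Console.WriteLine(Convert.ToBoolean(5)); // True
        Console.WriteLine(Convert.ToBoolean(0)); // False
    }
}
```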

In short, both explicit and implicit conversion can be used for data type conversion in C#, but explicit conversion is generally used when there is a potential for data loss during the conversion, while implicit conversion is used when there is no risk of data loss.