
Use a Pointer class in C#? The more you know about reference and value types!

Hello again dear readers!

This time let’s settle for a post in the category of „good to know“.

Today we are going to write a pointer class in C#!
„But Alexander, that’s absolutely useless, C# classes are always reference types and we do have the ref and out parameters!“

Haha! That might be true, but there are situations where these things are not enough! Humor me for a while, and let’s dig right in.

Value and Reference types in C# in comparison to C++

So why pointers anyway? In C++, pretty much everything is a value type by default. This means that if I instantiate a and b and then assign b to a, both variables hold the same contents, but a holds a copy of b. So if I change the contents of a, I will not change the original b.

There are many, many situations where this behavior is highly counterproductive. In C++ you can use references and pointers to avoid this. They store memory addresses, not unlike street addresses, and thus let you access the same value from different variables. A pointer does not hold the value itself; it points at the address where the actual value is stored.
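Although the paragraphs above talk about C++, the same two behaviors can be sketched in C#, where a struct is copied on assignment and a class instance is shared. The types PointStruct and PointClass are made up for this illustration and are not part of the original code.

```csharp
using System;

// Hypothetical types for illustration only.
public struct PointStruct { public int X; }
public class PointClass { public int X; }

public static class CopyDemo
{
    // Assigning a struct copies it, so changing the copy leaves the original alone.
    public static int StructCopy()
    {
        PointStruct b = new PointStruct { X = 1 };
        PointStruct a = b;   // a is an independent copy of b
        a.X = 42;
        return b.X;          // b is untouched
    }

    // Assigning a class instance copies only the reference to it.
    public static int ClassCopy()
    {
        PointClass b = new PointClass { X = 1 };
        PointClass a = b;    // a and b refer to the same object
        a.X = 42;
        return b.X;          // b sees the change
    }

    public static void Main()
    {
        Console.WriteLine(StructCopy()); // output: 1
        Console.WriteLine(ClassCopy());  // output: 42
    }
}
```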

C# and many other languages take this to the next level. Any class in C# is a reference type. Pointers do not exist explicitly in ordinary (safe) code, but we can handle value types like ints, bools or any struct like a reference by using the ref and out keywords when passing parameters.
Let’s have a look.

public static void IntChange(int i)
{
    ++i;
}

public static void RefIntChange(ref int i)
{
    ++i;
}

public static void OutIntChange(out int i, int iPrevious)
{
    // an out parameter has to be assigned before the method returns
    i = ++iPrevious;
}

[...snip...]

int ourInt = 10;

IntChange(ourInt);
Console.WriteLine(ourInt); // output: 10

RefIntChange(ref ourInt);
Console.WriteLine(ourInt); // output: 11

OutIntChange(out ourInt, 3);
Console.WriteLine(ourInt); // output: 4

As you can see, the method ‚IntChange‘ does not alter our integer outside of the function’s scope, because integers are a value type and the method only works on a copy. However, using the ref keyword passes a reference to that integer, which allows us to change it outside the method’s scope.

The out parameter is a bit special. A variable passed as an out parameter does not have to be initialized by the caller, but the method it is passed to has to assign it before returning. Usually this is used to enforce a certain kind of behavior, for example, when passing buffers to methods where we want the method to handle all the buffer’s properties.
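To make the out rule a bit more tangible, here is a minimal sketch of the common „try“ pattern. The method TryHalve is a hypothetical helper invented for this example; it only shows that every code path has to assign the out parameter and that the caller does not need to initialize the variable first.

```csharp
using System;

public static class OutDemo
{
    // Hypothetical helper: the return value reports success, the result comes back via 'out'.
    public static bool TryHalve(int input, out int half)
    {
        if (input % 2 != 0)
        {
            half = 0;      // even the failure path has to assign the out parameter
            return false;
        }
        half = input / 2;
        return true;
    }

    public static void Main()
    {
        int result;        // no initialization needed before the call
        if (TryHalve(10, out result))
        {
            Console.WriteLine(result); // output: 5
        }
        Console.WriteLine(TryHalve(7, out result)); // output: False
    }
}
```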
So. We are good, right?

Not quite…

There are some special cases where C# does not allow the usage of out and ref parameters (e.g. iterator methods, or lambda expressions that try to capture them), and some frameworks even enforce shallow copies of the parameters you pass.
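As a small sketch of the lambda restriction: a lambda is not allowed to capture a ref parameter (the compiler reports error CS1628), so the usual manual workaround is to copy the value into a local and write it back afterwards. The method name IncrementLater is made up for this example.

```csharp
using System;

public static class LambdaDemo
{
    public static void IncrementLater(ref int counter)
    {
        // Action bad = () => counter++;  // does not compile (CS1628): ref parameter inside a lambda

        int local = counter;               // copy into a local, which a lambda may capture
        Action increment = () => local++;
        increment();
        counter = local;                   // write the result back by hand
    }

    public static void Main()
    {
        int n = 10;
        IncrementLater(ref n);
        Console.WriteLine(n); // output: 11
    }
}
```

The write-back only works because the lambda runs before the method returns. If the lambda were stored and called later, the caller's variable would no longer be reachable this way.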

In all honesty, the whole value-type/reference-type thing is a topic I used to torture students with to no end. There are many resources and exercises on the web to master this, as it’s absolutely fundamental. So should you be a beginner and not yet have a firm grasp on this, the rest of the post might be a little confusing at first.

A pointer class… in C#…? Should we?

I once encountered a very strange bug. I wrote a function that received a list of objects and performed „in place“ operations on said list. I’ve done this dozens of times in other environments so I thought I was all set.
However, the list never updated outside the function’s scope. The function was not invoked directly by me but through a proxy function of the framework, so I figured that the process might enforce some parameter copies, and yes, that appeared to be the case. The framework method was indeed making a shallow copy of all passed parameters before passing them on again!
It was no surprise, then, that trying to pass my lists as ‚out‘ or ‚ref‘ failed with an error thrown by the framework. Such parameters would have to be defined explicitly by the framework as separate method overloads, which of course they were not.

A quick search on the net matched me with a few dozen other puzzled developers. To my relief, an old hand at Fortran provided a helpful answer to that problem. Apparently, back in the Fortran days they wrote their pointers as arrays. I’ve never laid eyes on Fortran code myself, so I have no idea, but wrapping the whole thing in yet another reference type did work, since the framework in question handled its inputs as „shallow copies“.

 

Tricking a Shallow Copy, retaining a Reference Type!

So what exactly is a shallow copy anyway? Imagine the following: You have a box with a few pieces of paper in it. On each piece of paper you write the address of a friend. Let’s shallow copy that box now. We take another box and some more paper, transcribe all the addresses and put them into the second box. Congratulations, you now have two boxes with exactly the same contents. However, you did not copy the houses of your friends; the addresses written on those pieces of paper still point to the very same houses. But if I change one of the addresses in box B, box A will still have the old address.
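The box analogy can be sketched in a few lines of C#. This is only an illustration: the Friend class and the addresses are made up, and the List copy constructor stands in for whatever mechanism performs the shallow copy.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type for illustration: the "house" our slips of paper point to.
public class Friend { public string Address; }

public static class ShallowCopyDemo
{
    public static string[] Run()
    {
        var boxA = new List<Friend> { new Friend { Address = "1 Main St" } };
        var boxB = new List<Friend>(boxA);    // shallow copy: new box, same Friend objects

        boxB[0].Address = "2 Side St";        // changes the shared house itself
        string seenThroughA = boxA[0].Address;

        boxB[0] = new Friend { Address = "3 New Rd" }; // swaps a slip of paper in box B only
        string stillInA = boxA[0].Address;

        return new[] { seenThroughA, stillInA };
    }

    public static void Main()
    {
        string[] result = Run();
        Console.WriteLine(result[0]); // output: 2 Side St
        Console.WriteLine(result[1]); // output: 2 Side St
    }
}
```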

Now, I want someone to sort through my box (A) for me. It won’t do me any good to give box B to that person, because my box A would stay the same. Instead, I’ll give him a piece of paper on which I describe exactly where box A is. He copies that for future reference and can then find box A to sort through my stuff.

An array is essentially that description here, which is why the array workaround worked. But since that’s weird, let’s take a look at a generic approach to a reference wrapper.
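Before we get to the generic version, here is a rough sketch of the array trick. The framework from my story is not shown anywhere in this post, so InvokeWithCopiedArgs is a made-up stand-in that simply clones the argument list before forwarding it. That is an assumption about how such a proxy might behave, not the real thing.

```csharp
using System;

public static class ArrayPointerDemo
{
    // Hypothetical stand-in for a framework proxy that shallow-copies all arguments.
    static void InvokeWithCopiedArgs(Action<object[]> target, object[] args)
    {
        object[] copy = (object[])args.Clone(); // new argument list, same referenced objects
        target(copy);
    }

    // A plain int is boxed into the argument list; the increment only reaches the copy.
    public static int PlainInt()
    {
        object[] args = { 10 };
        InvokeWithCopiedArgs(a => { a[0] = (int)a[0] + 1; }, args);
        return (int)args[0];
    }

    // A one-element array is a reference type, so the copy still points at the same array.
    public static int WrappedInt()
    {
        int[] cell = { 10 };
        InvokeWithCopiedArgs(a => { ((int[])a[0])[0]++; }, new object[] { cell });
        return cell[0];
    }

    public static void Main()
    {
        Console.WriteLine(PlainInt());   // output: 10
        Console.WriteLine(WrappedInt()); // output: 11
    }
}
```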

using System;
using System.Reflection;

namespace PtrPost
{
    /// <summary>
    /// Generic base class for a simulated pointer in C# to use value types as references when the
    /// out or ref keywords are not available or shallow copies of references are enforced in the
    /// scope the class is used in.
    /// </summary>
    /// <typeparam name="T"></typeparam>
    public class Ptr<T>
    {
        T value;

        public T Value
        {
            get
            {
                return value;
            }

            set
            {
                this.value = value;
            }
        }

        /// <summary>
        /// Creates the pointer object and will optionally try to instantiate the object with a default value or by calling a 
        /// parameterless standard constructor. Will throw an error if instantiation fails (Instantiation was wanted but
        /// the provided class does not have a parameterless standard constructor)
        /// Use this constructor for classes with a default constructor or value types.
        /// </summary>
        /// <param name="instantiate">Try to instantiate object</param>
        public Ptr(bool instantiate = true)
        {
            if (!instantiate) return;

            // sets the value to the type's default value (e.g. 0 for int, null for classes)
            value = default(T);

            // a null default means T is a reference type (or a Nullable<T>)
            if (value == null)
            {
                // string is a special type and does not have an accessible constructor, so enforce
                // a standard value of '""' here.
                if (typeof(string).IsAssignableFrom(typeof(T)))
                {
                    value = (T)Convert.ChangeType("",typeof(T));
                    return;
                }
                // For everything else, try calling a standard constructor
                ConstructorInfo ctor = typeof(T).GetConstructor(new Type[] { });
                if (ctor != null)
                {
                    value = (T)Convert.ChangeType(ctor.Invoke(new object[] { }), typeof(T));
                }
                else
                {
                    throw new Exception("Could not find a default constructor with 0 arguments for Ptr<T> instance, type: " + typeof(T));
                }
            }
        }

        /// <summary>
        /// Creates the pointer object and will try to instantiate it using the provided parameters as an identification and arguments for a constructor.
        /// Use this method to create and instantiate pointers for classes without a parameterless standard constructor
        /// </summary>
        /// <param name="ctorTypes"></param>
        /// <param name="args"></param>
        public Ptr(Type[] ctorTypes, object[] args)
        {
            if(ctorTypes.Length != args.Length)
            {
                throw new Exception("Argument Type-List and argument list are incompatible (Count Mismatch)");
            }
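            for (int i = 0; i < ctorTypes.Length; i++)
            {
                // a null argument has no runtime type to check; otherwise accept any instance
                // that is compatible with the declared parameter type (including derived types)
                if (args[i] != null && !ctorTypes[i].IsInstanceOfType(args[i]))
                {
                    throw new Exception("Argument Type-List and argument list are incompatible (Type Mismatch)");
                }
            }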

            ConstructorInfo ctor = typeof(T).GetConstructor(ctorTypes);
            if (ctor != null)
            {
                value = (T)Convert.ChangeType(ctor.Invoke(args), typeof(T));
            }
            else
            {
                throw new Exception("Could not find a constructor with the provided argument types for Ptr<T> instance, type: " + typeof(T) + "\nor argument list was incompatible.");
            }
        }
    }
}

The basic idea behind this class is that we try to always have an instantiated value behind the ‚Value‘ property. When we hit a space that enforces shallow copies, only the reference to the pointer object gets copied; the object it refers to, and everything inside it, stays the same. Adding this extra layer prevents our reference from being lost and allows us to change the Value property from multiple places.
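To illustrate the „multiple places“ part, here is a small sketch using a trimmed-down stand-in for the class above. MiniPtr<T> is a hypothetical simplification that has only the Value property and none of the reflection logic. Because it is an ordinary reference type, it can be captured by a lambda and passed into an iterator method, which is exactly where ref and out are not allowed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, trimmed-down stand-in for Ptr<T>: just the Value property.
public class MiniPtr<T> { public T Value { get; set; } }

public static class SharedPtrDemo
{
    // A lambda may capture an ordinary parameter, so it can change the shared value later.
    public static Action MakeIncrementer(MiniPtr<int> p)
    {
        return () => p.Value++;
    }

    // Iterator methods cannot take ref or out parameters, but a wrapper object is fine.
    public static IEnumerable<int> CountUp(MiniPtr<int> p, int steps)
    {
        for (int i = 0; i < steps; i++)
        {
            p.Value++;
            yield return p.Value;
        }
    }

    public static int Run()
    {
        var p = new MiniPtr<int> { Value = 0 };
        Action increment = MakeIncrementer(p);
        increment();                              // Value is now 1
        foreach (int step in CountUp(p, 2)) { }   // Value is now 3
        return p.Value;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // output: 3
    }
}
```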

In order to ensure an initialized state of the value property we use Reflection, which actually reminds me that I really need to make a post about reflection and generics.

Anyway, the basics: We try to assign a default value to our property. If this results in null, we know that we are dealing with a class, so we have to call a constructor for initialization. We look up a matching constructor, and depending on which constructor we used for our Ptr class, we try to match our argument lists and pass parameters. The key functions here are Type.GetConstructor, Convert.ChangeType and Type.IsAssignableFrom.

GetConstructor retrieves information about a constructor of the generic type that matches the given parameter types, so we can actually find out if and how we can initialize our object. It returns null if no such constructor exists.

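ChangeType converts an object to a given target type, which lets us turn the object returned by the constructor into our generic type T. Since that object already is a T, a direct cast would work here as well.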

And finally, IsAssignableFrom checks whether an instance of one type can be assigned to a variable of another type, which is a prerequisite for the ChangeType conversion attempt.
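To see the three calls in isolation, here is a minimal sketch outside of the Ptr class. The Sample class is invented for this example; the reflection calls themselves are the regular ones from System and System.Reflection.

```csharp
using System;
using System.Reflection;

// Hypothetical type for illustration only.
public class Sample
{
    public string Name;
    public Sample() { Name = "default"; }
    public Sample(string name) { Name = name; }
}

public static class ReflectionDemo
{
    // Looks up the parameterless constructor, invokes it and converts the result.
    public static string CreateDefault()
    {
        ConstructorInfo ctor = typeof(Sample).GetConstructor(Type.EmptyTypes);
        object created = ctor.Invoke(new object[0]);
        Sample sample = (Sample)Convert.ChangeType(created, typeof(Sample));
        return sample.Name;
    }

    // Looks up the constructor taking a single string and passes the argument along.
    public static string CreateNamed(string name)
    {
        ConstructorInfo ctor = typeof(Sample).GetConstructor(new Type[] { typeof(string) });
        Sample sample = (Sample)ctor.Invoke(new object[] { name });
        return sample.Name;
    }

    // string has no public parameterless constructor, so GetConstructor returns null for it.
    public static bool HasDefaultConstructor(Type type)
    {
        return type.GetConstructor(Type.EmptyTypes) != null;
    }

    public static void Main()
    {
        Console.WriteLine(CreateDefault());                                  // output: default
        Console.WriteLine(CreateNamed("Test2"));                             // output: Test2
        Console.WriteLine(HasDefaultConstructor(typeof(string)));            // output: False
        Console.WriteLine(typeof(object).IsAssignableFrom(typeof(string)));  // output: True
        Console.WriteLine(typeof(string).IsAssignableFrom(typeof(object)));  // output: False
    }
}
```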

Usage:

public static void pIntChange(Ptr<int> pInt)
{
    ++pInt.Value;
}

[...snip...]

Ptr<int> pInt = new Ptr<int>(); // default to value == 0
pIntChange(pInt);
Console.WriteLine(pInt.Value); //output: 1

Ptr<Test> pTestClass = new Ptr<Test>();
pTestClass.Value.Printself(); // output: Test true

Ptr<Test> pTestClass2 = new Ptr<Test>(
    new Type[] { typeof(string), typeof(bool) },
    new object[] { "Test2", false });
pTestClass2.Value.Printself(); // output: Test2, false

But... should we really?

Now, there is a reason why you won’t see this kind of code often in well-made C# applications. It is, essentially, a prettily written hack. In a well-designed system you should never have to use a workaround like this, because if you know how your program’s scopes have to work and how your data flows, you won’t build in limitations like the ones I described above in the first place.

Perhaps you should treat this more as an exercise in value and reference types. However, if you ever encounter limitations as described above, and you absolutely have to break them, you now know how to do it with style.

Thanks for your time and see you in the next post!

Alexander
